Dirty-dataImpacts

Codes & Datasets

Codes: 6 classification algorithms, 6 clustering algorithms, and 5 regression algorithms. All code is written in C++. Note: LogisticRegression is used for both classification and regression.

Datasets: 5 original classification datasets, 5 original clustering datasets, and 5 original regression datasets.

Dirty data are injected into the original datasets in three forms: missing data, inconsistent data, and conflicting data. The missing rate, the inconsistency rate, and the conflict rate each vary from 10% to 50%.

If you have any questions, please email [email protected]. Enjoy!
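To make "missing rate" concrete, here is a minimal sketch of how missing values could be injected into one numeric column at a given rate. It is an illustration only, not the injection code used in this repository: the names `inject_missing` and `count_missing`, the use of NaN as the missing-value marker, and the choice to blank exactly `round(rate * n)` cells uniformly at random are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not from this repository): returns a copy of `column`
// in which round(rate * n) randomly chosen cells are replaced by NaN.
// NaN as the "missing" marker is an assumption made for this sketch.
std::vector<double> inject_missing(const std::vector<double>& column,
                                   double rate, unsigned seed) {
    assert(rate >= 0.0 && rate <= 1.0);
    std::vector<double> out(column);

    // Shuffle the cell indices so the blanked cells are chosen uniformly.
    std::vector<std::size_t> idx(column.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::mt19937 gen(seed);
    std::shuffle(idx.begin(), idx.end(), gen);

    // Number of cells to blank, rounded to the nearest integer.
    const std::size_t n_missing =
        static_cast<std::size_t>(rate * static_cast<double>(column.size()) + 0.5);
    for (std::size_t i = 0; i < n_missing && i < idx.size(); ++i) {
        out[idx[i]] = std::numeric_limits<double>::quiet_NaN();
    }
    return out;
}

// Hypothetical helper: counts the cells marked as missing (NaN).
std::size_t count_missing(const std::vector<double>& column) {
    return static_cast<std::size_t>(
        std::count_if(column.begin(), column.end(),
                      [](double v) { return std::isnan(v); }));
}
```

Under these assumptions, calling `inject_missing(column, 0.3, 42)` on a 10-value column blanks 3 cells. Fixing the seed keeps each dirty variant reproducible, so the same original dataset can be compared across the 10% to 50% range.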