
Something wrong with labeling in classifier #60

Open
ArtificialNotImbecile opened this issue Nov 16, 2017 · 2 comments

Comments

@ArtificialNotImbecile

ArtificialNotImbecile commented Nov 16, 2017

When I use this data set in a classification problem, the metric=accuracy is below 0.4 and the prediction on the test data is about 0.5 (a random guess; however, the same algorithm in Python gives me about 0.6 accuracy). The AUC metric behaves consistently with Python sklearn algorithms during the run phase, but the predictions on the test data do not match the AUC reported in the StackNet command-line output (again, it looks like a random guess).
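As a sanity check independent of StackNet's printed metrics, the accuracy/AUC mismatch can be reproduced with sklearn directly. This is a minimal sketch with hypothetical arrays standing in for the real test labels and predicted probabilities; the key point is that AUC is threshold-free and tolerates -1/1 labels, while accuracy requires hard predictions in the same label space as the targets:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical stand-ins for the real test labels (-1/1 encoded)
# and the model's predicted probabilities for the positive class.
y_true = np.array([1, -1, 1, 1, -1, -1, 1, -1])
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])

# AUC only ranks scores, so it works with -1/1 labels directly.
auc = roc_auc_score(y_true, y_prob)

# Accuracy needs hard predictions in the SAME label space as y_true:
# thresholding must yield -1/1 here, not 0/1, or the reported score
# collapses toward random-guess level even when AUC is high.
y_pred = np.where(y_prob >= 0.5, 1, -1)
acc = accuracy_score(y_true, y_pred)
print(auc, acc)
```

If the printed accuracy disagrees with a check like this on the same predictions, the problem is in how the tool maps scores to labels, not in the model itself.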

One possible problem is that the data uses (-1, 1) to encode the target class, which is unusual, but since sklearn handles this fine, I really hope StackNet can do so as well!
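Until the -1/1 handling is confirmed, a simple workaround (my own sketch, not part of StackNet) is to remap the target column to 0/1 before writing the training file, and map predictions back afterwards:

```python
import numpy as np

# Hypothetical -1/1 target column as found in the data set.
y = np.array([-1, 1, 1, -1, 1])

# Remap to the conventional 0/1 encoding: -1 -> 0, 1 -> 1.
y01 = (y == 1).astype(int)

# If downstream consumers expect -1/1, invert the mapping afterwards.
y_back = np.where(y01 == 1, 1, -1)
print(y01, y_back)
```

This sidesteps any assumption inside the tool that binary labels are 0/1.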

@kaz-Anova
Owner

Could you share the whole command you used, the parameters file, and if possible the dataset (if you made any changes to it)? In principle it should work with -1/1, but there could be a bug.

@ArtificialNotImbecile
Author

ArtificialNotImbecile commented Nov 16, 2017

params
command I used

And I made a mistake: the prediction accuracy and AUC score are actually normal on the test data, but the command-line output about accuracy is wrong. The problem exists when running on the original data. Thanks!

P.S. I changed the params.txt file a lot to test whether there is actually a problem. When using logistic regression the prediction accuracy is about 0.4, but with SklearnRandomforest it is close to 0.5 and sometimes above 0.5 (note, though, that logistic regression actually performs better than random forest).
@kaz-Anova
