I'm using sklearn's Random Forest to train my model. With the same input features, I encoded the target labels in two ways: first with label_binarize, which produces a one-hot encoding, and second with LabelEncoder. The two cases give different accuracy scores. Is there a specific reason for this, given that only the label encoding method changed and the input features did not?

drew_psy

2 Answers

The difference may come not from the label encoding but from the randomness of Random Forest.

Try fixing random_state so that repeated runs give the same result.
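As a minimal sketch of that suggestion, the snippet below trains the same forest twice with an identical random_state and compares the scores. The iris dataset, the split, and the helper name fit_and_score are assumptions chosen for illustration, not part of the original question.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative data only; the question's own dataset is not known.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)


def fit_and_score(seed):
    """Hypothetical helper: train a forest with a fixed seed and return test accuracy."""
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Same seed, same data: the two runs build identical forests.
score_a = fit_and_score(42)
score_b = fit_and_score(42)
print(score_a == score_b)
```

If the scores for the two encodings still differ once the seed is fixed, the randomness of the forest is not the whole explanation.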

Gilseung Ahn

https://datascience.stackexchange.com/questions/74364/random-forrest-sklearn-gives-different-accuracy-for-different-target-label-encod

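Basically, when you one-hot encode the target labels, sklearn treats the task as a multilabel problem, whereas LabelEncoder produces a 1d array, which sklearn treats as a multiclass problem. In the multilabel case each label column is predicted separately, so a sample can end up with no label or with several, and accuracy_score counts a sample as correct only if every column matches.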

https://scikit-learn.org/stable/modules/multiclass.html
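A small sketch of the distinction described above, using the iris dataset as a stand-in for the question's data (an assumption). It checks how sklearn classifies each target format with type_of_target and shows that the two fitted forests return predictions of different shapes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, label_binarize
from sklearn.utils.multiclass import type_of_target

# Illustrative data only; three classes labelled 0, 1, 2.
X, y = load_iris(return_X_y=True)

# 1d integer labels versus a (n_samples, n_classes) indicator matrix.
y_1d = LabelEncoder().fit_transform(y)
y_onehot = label_binarize(y, classes=[0, 1, 2])

kind_1d = type_of_target(y_1d)
kind_onehot = type_of_target(y_onehot)
print(kind_1d, kind_onehot)

# Same features and seed; only the target format differs.
clf_mc = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y_1d)
clf_ml = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y_onehot)

pred_mc = clf_mc.predict(X)  # one class per sample
pred_ml = clf_ml.predict(X)  # one 0/1 decision per class column
print(pred_mc.shape, pred_ml.shape)
```

To compare the two setups on equal terms, one option is to keep the 1d labels for training and scoring, since that is the format sklearn treats as multiclass.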

drew_psy