I'm running a decision tree classifier on the data in the picture. As you can see, some features, such as time signature and signature key, need to be one-hot encoded as 1s and 0s. However, in the dataframe all the 0s and 1s are of type float. As a result, my decision tree classifier doesn't seem to distinguish between a feature being present or absent; instead it splits on thresholds of 0.5, as can be seen in the second picture. How can I fix this?

Thanks in advance.

I already tried converting all the floats to ints, but couldn't figure out how.

1 Answer

Decision trees don't classify by checking whether a feature is present or not. Every node splits on a numeric threshold, so a binary feature is separated at some value between 0 and 1 (such as 0.5): the 0s go to one side and the 1s to the other.

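This is how trees operate; there is no bug. A split at 0.5 on a one-hot column is exactly the "present or absent" check you are looking for, and converting the columns to int would not change it.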
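To make that concrete, here is a minimal pure-Python sketch. It assumes the tree places candidate thresholds at the midpoints between adjacent distinct values of a feature, which is my understanding of how scikit-learn's default splitter behaves. The function names and the sample column are made up for illustration and are not part of any library.

```python
def candidate_thresholds(values):
    """Midpoints between adjacent distinct values of one feature column."""
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


def split(rows, column, threshold):
    """Partition rows like a tree node: value <= threshold goes left."""
    left = [row for row in rows if row[column] <= threshold]
    right = [row for row in rows if row[column] > threshold]
    return left, right


# Hypothetical one-hot column, stored once as floats and once as ints.
as_float = [0.0, 1.0, 1.0, 0.0, 1.0]
as_int = [0, 1, 1, 0, 1]

# A binary column has a single possible threshold, whatever its dtype.
print(candidate_thresholds(as_float))  # [0.5]
print(candidate_thresholds(as_int))    # [0.5]

# Splitting at 0.5 sends every 0 left and every 1 right.
rows = [(v,) for v in as_float]
left, right = split(rows, column=0, threshold=0.5)
print(left)   # [(0.0,), (0.0,)]
print(right)  # [(1.0,), (1.0,), (1.0,)]
```

If you still prefer integer columns for readability, something like `df.astype(int)` in pandas should do it (assuming `df` is your dataframe and it has no missing values), but the resulting tree would be the same.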

Here is a StatQuest video on classification decision trees: https://www.youtube.com/watch?v=_L39rN6gz7Y

Matan Bendak