sklearn 0.24.2, Python 3.8
While trying to use a decision tree regressor in sklearn, I've come across a common problem. I've read lots of questions about it, but none of them has a definitive answer.
I want to handle a categorical (non-ordinal, high-cardinality) column; however:
- OrdinalEncoder assigns an order to the categories (1 < 2 < 3, and so on), which is a problem because my column has no inherent order.
- OneHotEncoder produces a high-dimensional feature matrix, which leads to deeper trees and more computation.
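To make the two options concrete, here is a minimal sketch of what I mean. The column is made up (50 hypothetical categories named `cat_00` to `cat_49`, 500 rows); my real data differs, but the shapes show the trade-off.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical nominal column with 50 distinct categories (no natural order).
rng = np.random.default_rng(0)
categories = np.array([f"cat_{i:02d}" for i in range(50)])
# Include every category once, then pad with random draws to get 500 rows.
column = np.concatenate([categories, rng.choice(categories, size=450)])
X = column.reshape(-1, 1)

# OrdinalEncoder: a single column, but the codes imply cat_00 < cat_01 < ...
ordinal = OrdinalEncoder().fit_transform(X)
print(ordinal.shape)  # (500, 1)

# OneHotEncoder: no implied order, but one output column per category.
onehot = OneHotEncoder().fit_transform(X)
print(onehot.shape)  # (500, 50)
```

With the ordinal codes, a tree can only split on thresholds such as `code <= 24.5`, which groups categories by their arbitrary code order. With one-hot, each split isolates a single category.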
Is there no way to do this with sklearn? If not, what other libraries are recommended?
The question "Can sklearn DecisionTreeClassifier truly work with categorical data?" says I could use BinaryEncoding; however, this seems to be intended for the case where cardinality = 2.
Also, R's library(tree) handles categorical columns without any preprocessing. How can this be done in a similar way in Python?