I have been trying to use a categorical inpust in a regression tree (or Random Forest Regressor) but sklearn keeps returning errors and asking for numerical inputs.
import sklearn as sk
MODEL = sk.ensemble.RandomForestRegressor(n_estimators=100)
MODEL.fit([('a',1,2),('b',2,3),('a',3,2),('b',1,3)], [1,2.5,3,4]) # does not work
MODEL.fit([(1,1,2),(2,2,3),(1,3,2),(2,1,3)], [1,2.5,3,4]) #works
MODEL = sk.tree.DecisionTreeRegressor()
MODEL.fit([('a',1,2),('b',2,3),('a',3,2),('b',1,3)], [1,2.5,3,4]) # does not work
MODEL.fit([(1,1,2),(2,2,3),(1,3,2),(2,1,3)], [1,2.5,3,4]) #works
To my understanding, categorical inputs should be possible in these methods without any conversion (e.g. WOE substitution).
Has anyone else had this difficulty?
thanks!