I am using sklearn for a machine learning project, and one of the columns is categorical. I would like to convert it to numerical form with an ordinal encoder first and then impute the missing data. However, sklearn's OrdinalEncoder raises an error on the missing values:
ValueError: Input contains NaN
I would really rather not run a categorical imputer first and then convert the values into numbers, because imputing on the raw categories is much less suited to the nature of the data. Is there any way around this?
Here is the code:
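from sklearn.preprocessing import OrdinalEncoder
# info is my DataFrame; imd_band is the categorical column with missing values
ordinalenc = OrdinalEncoder()
# this is the line that raises the ValueError
imd = ordinalenc.fit_transform(info[["imd_band"]])
print(ordinalenc.categories_)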
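
To make the goal concrete, here is a minimal sketch of the behaviour I am after: encode only the rows that have a value, leave NaN where the value is missing, and impute on the numeric codes afterwards. The small `info` frame and its band labels are made-up stand-ins for my real data, and the median strategy is only an assumption for illustration, not necessarily the imputer I would end up using.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for the real `info` frame; the band labels are invented.
info = pd.DataFrame(
    {"imd_band": ["0-10%", "10-20%", np.nan, "20-30%", "10-20%", np.nan]}
)

# Fit and transform only the rows that actually have a value,
# so the encoder never sees a NaN.
present = info["imd_band"].notna()
ordinalenc = OrdinalEncoder()
codes = ordinalenc.fit_transform(info.loc[present, ["imd_band"]])

# Write the codes back into a float column, leaving NaN where the
# original value was missing.
imd = np.full((len(info), 1), np.nan)
imd[present.to_numpy()] = codes

# Impute on the numeric codes (median chosen only for illustration).
imd_imputed = SimpleImputer(strategy="median").fit_transform(imd)

print(ordinalenc.categories_)
print(imd_imputed.ravel())
```

Note that by default OrdinalEncoder orders the categories by sorting the labels, which may not match the real order of the bands; passing an explicit list through its `categories` parameter would fix the order.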