I have a DataFrame, which I will call df, and I obtain an ndarray from it as follows:
X=df.iloc[:,5:].values
which I want to use for a machine learning model. I need to one-hot encode the 12th column of X.
Using sklearn, I first label-encoded it as follows:
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
labelencoder_x=LabelEncoder()
X[:,12]=labelencoder_x.fit_transform(X[:,12])
and this works fine.
Next, I try one-hot encoding as follows:
onehotencoder=OneHotEncoder(categorical_features=[12])
X=onehotencoder.fit_transform(X).toarray()
and I get the following error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Could someone help me with this? I'm new to programming in Python and am eager to learn what is wrong with what I did and how I can fix it. I tried some debugging: I checked whether np.nan is in the 12th column and got False, and I also checked the type of each element in the 12th column, which is int.
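A minimal sketch of a NaN check that covers every column of the array rather than only the 12th, since the encoder may validate the whole input and not just the column being encoded. The array X_demo and the names float_col, nan_mask and cols_with_nan are made up for illustration and are not the real data. It also shows why a membership test with np.nan can report False even when a NaN is present.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X (not the real data): column 1 holds
# label-encoded integers, column 0 hides a NaN.
X_demo = np.array([[1.0, 0], [np.nan, 1], [3.0, 0]], dtype=object)

# A membership test compares with ==, and NaN never equals itself,
# so this kind of check can miss a NaN that is really there.
float_col = np.array([1.0, np.nan, 3.0])
nan_found_by_in = np.nan in float_col

# pd.isnull tests element-wise and also works on object arrays.
nan_mask = pd.isnull(X_demo)
cols_with_nan = np.where(nan_mask.any(axis=0))[0]

print("found by `in`:", nan_found_by_in)
print("columns containing NaN:", cols_with_nan.tolist())
```

Running the same pd.isnull check on the real X would list which columns, if any, contain missing values before the array is passed to the encoder.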