Error using Sklearn in a for loop

Question

I am running Python 3, and when I attempt to run this code:

from sklearn.preprocessing import LabelEncoder
cv=train.dtypes.loc[train.dtypes=='object'].index
print (cv)

le=LabelEncoder()
for i in cv:
    train[i]=le.fit_transform(train[i])
    test[i]=le.fit_transform(test[i])

However, I get this error:

le=LabelEncoder()
for i in cv:
    train[i]=le.fit_transform(train[i])
    test[i]=le.fit_transform(test[i])


Traceback (most recent call last):

  File "<ipython-input-5-8739984f61b2>", line 3, in <module>
    train[i]=le.fit_transform(train[i])

  File "C:\Users\myname\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 127, in fit_transform
    self.classes_, y = np.unique(y, return_inverse=True)

  File "C:\Users\myname\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py", line 195, in unique
    perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')

TypeError: unorderable types: str() > float()

Oddly enough, if I call the encoder on a single column of my data, it succeeds. For instance:

le.fit_transform(test['Race'])

Results in:

le.fit_transform(test['Race'])
Out[7]: array([2, 4, 4, ..., 4, 1, 4], dtype=int64)

I've tried wrapping the call in float(le.fit_transform(train[i])) and in str(le.fit_transform(train[i])).

Neither worked.

Could someone please help me out?

Do your columns contain empty values alongside categorical values, for example `NaN`s mixed with strings in a particular column? Such columns are still recognized as `object` dtype, and when you call `fit_transform` on them you get the TypeError because both `str` and `float` (NaN) values are present. - Nickil Maveli, Sep 16 '16 at 09:21
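
If that is the cause, one possible workaround is sketched below. It is a minimal example under stated assumptions, not a confirmed fix for the asker's data: the two small frames are hypothetical stand-ins for `train` and `test`, and the placeholder label "missing" is an arbitrary choice. The sketch fills the NaN values with a string so each column holds only `str`, then fits a single encoder per column on the labels from both frames so that a category maps to the same integer in each.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-ins for the asker's train and test frames.
# The Race column mixes str values with NaN (a float), as the comment suggests.
train = pd.DataFrame({
    "Race": pd.Series(["White", np.nan, "Black"], dtype=object),
    "Age": [39, 50, 38],
})
test = pd.DataFrame({
    "Race": pd.Series(["Black", "Asian", np.nan], dtype=object),
    "Age": [25, 41, 33],
})

# Same column selection as in the question.
cv = train.dtypes.loc[train.dtypes == "object"].index

for col in cv:
    # Replace NaN with a placeholder string ("missing" is an assumption)
    # so the column no longer mixes str and float values.
    train[col] = train[col].fillna("missing").astype(str)
    test[col] = test[col].fillna("missing").astype(str)

    # Fit one encoder per column on the labels of both frames, then
    # transform each frame with that same fitted encoder.
    le = LabelEncoder()
    le.fit(pd.concat([train[col], test[col]]))
    train[col] = le.transform(train[col])
    test[col] = le.transform(test[col])

print(train)
print(test)
```

Calling `fit_transform` separately on `train[i]` and `test[i]`, as in the question, refits the encoder each time, so the same category can receive different codes in the two frames. Fitting once and calling `transform` twice avoids that.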
