
I'm working on an ML web app and training a model on data from a CSV file. When I label-encode the categorical columns and convert the data array to float, a ValueError is raised.

CODE

X[:, 0] = le_country.transform(X[:, 0])
X[:, 1] = le_education.transform(X[:, 1])
X = X.astype(float)
X

ERROR

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In [54], line 1
----> 1 X[:, 0] = le_country.transform(X[:,0])
      2 X[:, 1] = le_education.transform(X[:,1])
      3 X = X.astype(float)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\preprocessing\_label.py:138, in LabelEncoder.transform(self, y)
    135 if _num_samples(y) == 0:
    136     return np.array([])
--> 138 return _encode(y, uniques=self.classes_)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\utils\_encode.py:226, in _encode(values, uniques, check_unknown)
    224         return _map_to_integer(values, uniques)
    225     except KeyError as e:
--> 226         raise ValueError(f"y contains previously unseen labels: {str(e)}")
    227 else:
    228     if check_unknown:

ValueError: y contains previously unseen labels: 'United States'

1 Answer


An encoder has to be fitted before you call transform on it, for example:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
le.fit([1, 2, 2, 6])

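Your encoder was most likely fitted on data that does not contain the label 'United States', which appears in the column you are now transforming. A LabelEncoder can only transform labels it saw during fit, so fit it on that same column (or on data covering every value the column can take) before calling transform. An encoder that had never been fitted at all would raise a NotFittedError instead of this ValueError.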
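As a rough sketch of the above: the small array below is a made-up stand-in for your X (the values and the third column are assumptions, not your data), and `stale` is a hypothetical encoder fitted on labels that do not include 'United States'. It reproduces the error first, then fits `le_country` and `le_education` on the columns they are meant to transform.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for X: country, education level, years of experience.
X = np.array(
    [
        ["United States", "Bachelor", 5],
        ["Germany", "Master", 3],
        ["United States", "Master", 10],
    ],
    dtype=object,
)

# Reproduce the error: this encoder never saw 'United States' during fit.
stale = LabelEncoder().fit(["Germany", "India"])
try:
    stale.transform(X[:, 0])
except ValueError as exc:
    message = str(exc)  # "y contains previously unseen labels: ..."

# Fit each encoder on the column it will transform, then encode in place.
le_country = LabelEncoder()
le_education = LabelEncoder()
X[:, 0] = le_country.fit_transform(X[:, 0])
X[:, 1] = le_education.fit_transform(X[:, 1])

# Every column is numeric now, so the conversion to float succeeds.
X = X.astype(float)
```

If the encoders are fitted in one place and reused in another (for example, saved during training and loaded in the web app), it may be worth checking `le_country.classes_` to see which labels the encoder actually knows.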

André Guerra