
Whenever I try to run the following code, it raises ValueError: y contains previously unseen labels: 'some_label'

X_test['Gender'] = le.transform(X_test['Gender'])
X_test['Age'] = le.transform(X_test['Age'])
X_test['City_Category'] = le.transform(X_test['City_Category'])
X_test['Stay_In_Current_City_Years'] = le.transform(X_test['Stay_In_Current_City_Years'])
Nil

1 Answer


I am not sure what your whole code looks like, but I think the problem is that your test data contains values that your training data does not. In other words, when you call "transform", some value in the test set was not present when you fitted the transformer on the training data.
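
One way to check this diagnosis is to compare the values in each test column against the classes the encoder learned. The sketch below uses made-up data and a hypothetical helper, `unseen_labels`. It also fits one LabelEncoder per column, on the assumption that the single `le` in the question was fitted on only one of the columns.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up data for illustration; the column names mirror the question.
X_train = pd.DataFrame({"Gender": ["F", "M", "F"], "City_Category": ["A", "B", "A"]})
X_test = pd.DataFrame({"Gender": ["M", "F"], "City_Category": ["A", "C"]})


def unseen_labels(encoder, values):
    """Hypothetical helper: values the fitted encoder has never seen."""
    return sorted(set(values) - set(encoder.classes_))


# Fit a separate encoder per column, on the training data only.
encoders = {col: LabelEncoder().fit(X_train[col]) for col in X_train.columns}

# Collect the test values that would make transform() raise ValueError.
problems = {col: unseen_labels(enc, X_test[col]) for col, enc in encoders.items()}
print(problems)
```

Any column with a non-empty list here is one where `le.transform` would fail on the test set.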

Let's see it with an example. Below I fit (train) a ColumnTransformer with a OneHotEncoder on the training data. When I then use it to transform the test data, it throws an error because it has never seen the value 'z', which is present in the test set but not in the training set:

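import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import make_column_transformer

df = pd.DataFrame(['a', 'b', 'c', 'a', 'b', 'z'], columns=['c1'])

# First three rows ('a', 'b', 'c') are the training data
train = df[:3]

# Last three rows ('a', 'b', 'z') are the test data; 'z' never appears in train
test = df[3:]

cl = make_column_transformer((OneHotEncoder(), train.columns))

# The encoder learns the categories 'a', 'b', 'c' from train only
cl.fit(train)

# Raises ValueError because 'z' is an unknown category
cl.transform(test)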

This raises the following error:

ValueError: Found unknown categories ['z'] in column 0 during transform
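
If the training data cannot be made to cover every category, one possible workaround is OneHotEncoder's `handle_unknown='ignore'` option, which encodes a category unseen during fit as an all-zero row instead of raising. A minimal sketch reusing the toy data above, with the encoder applied directly rather than through a ColumnTransformer to keep it short:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame(['a', 'b', 'c', 'a', 'b', 'z'], columns=['c1'])
train, test = df[:3], df[3:]

# handle_unknown='ignore' maps categories unseen during fit to all zeros
enc = OneHotEncoder(handle_unknown='ignore')
enc.fit(train)

# transform() returns a sparse matrix by default; convert it for display
encoded = enc.transform(test).toarray()
print(encoded)
```

The row for 'z' comes out as all zeros, so the model sees it as "none of the known categories". Whether that is acceptable depends on the model and the data.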

Abhishek
  • Is that really the solution in such scenarios? I doubt deleting the whole feature is ideal. – Walker Aug 26 '22 at 18:41
  • You don't need to delete the feature, but you do need to make sure train and test contain all the values of the column that need to be one-hot encoded. Quoting from the documentation: "transform the data to a binary one-hot encoding". How would the system handle an unknown category at transform time? The length of the binary encoding is decided by the number of categories, and that is what the system learns during ".fit". – Abhishek Aug 26 '22 at 19:24
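
For integer codes like those in the question, a possible alternative to LabelEncoder (which scikit-learn documents as intended for target labels) is OrdinalEncoder with `handle_unknown='use_encoded_value'`. It keeps the feature and assigns a sentinel code to unseen categories. The sketch below uses made-up data, and the sentinel `-1` is an arbitrary choice for illustration.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Made-up data for illustration; 'D' appears only in the test set.
X_train = pd.DataFrame({"Gender": ["F", "M", "F"], "City_Category": ["A", "B", "C"]})
X_test = pd.DataFrame({"Gender": ["M", "F"], "City_Category": ["B", "D"]})

# Unseen categories are encoded as unknown_value instead of raising an error.
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
enc.fit(X_train)

encoded = enc.transform(X_test)
print(encoded)
```

One encoder handles all columns here, each with its own learned categories, which avoids reusing a single fitted encoder across unrelated columns.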