uv = np.unique(X[:, 2])
uv2 = np.unique(X_test[:, 2])
print(uv)
#['Female' 'Male']
print(uv2)
#['Female' 'Male']
# Encoding categorical columns in the train dataset
from sklearn.preprocessing import LabelEncoder
labelencoder_X = LabelEncoder()
X[:, 2] = labelencoder_X.fit_transform(X[:, 2]) # Encoding column 2
# Encoding categorical columns in the test dataset
X_test[:, 2] = labelencoder_X.transform(X_test[:, 2]) # Encoding column 2
Result of last command:
ValueError: y contains previously unseen labels: 'Male'
I tried to mask the unseen values, but after encoding the resulting X_test
is empty.
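For reference, here is a minimal sketch of how the mismatch could be inspected and masked with NumPy alone. The arrays `known` and `test_col` are hypothetical stand-ins (assumptions, not the real data): `known` plays the role of `labelencoder_X.classes_` and `test_col` the role of `X_test[:, 2]`.

```python
import numpy as np

# Hypothetical stand-ins for the real data (assumptions for illustration):
# `known` plays the role of labelencoder_X.classes_,
# `test_col` plays the role of X_test[:, 2].
known = np.array(["Female", "Male"], dtype=object)
test_col = np.array(["Male", "Female", "Male ", "Other"], dtype=object)

# repr() makes stray whitespace or unexpected types easier to spot.
print([repr(v) for v in np.unique(test_col)])

# Labels that appear in the test column but not among the known classes.
unseen = np.setdiff1d(test_col, known)
print(unseen)

# Boolean mask of the rows whose label is among the known classes.
mask = np.isin(test_col, known)
encodable = test_col[mask]
print(mask.sum(), "of", len(test_col), "rows are encodable")
```

If the mask comes out all False on the real data, the classes the encoder was fitted on probably differ from the strings in the test column. One possible cause is that the encoder was fitted on a column that had already been encoded. Printing `labelencoder_X.classes_` right before the `transform` call would show what it actually learned.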