I am trying to use machine learning to predict some data, but it fails with a "cannot convert str to int" error. I tried LabelEncoder as well, but I still cannot get the program to run.
This is my attempt with label encoding, followed by the traceback it produces:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import LabelEncoder
gender_data = pd.read_csv('gender.csv')
le = LabelEncoder()
X = gender_data.drop(columns=['Gender'])
y = gender_data['Gender']
Xv = X.values
yv = y.values
le_encoder_X = le.fit(Xv)
le_encoded_X = le.transform(Xv)
le_encoder_y = le.fit(yv)
le_encoded_y = le.transform(yv)
X_train, X_test, y_train, y_test = train_test_split(le_encoded_X, le_encoded_y, test_size=0.2)
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
ValueError                                Traceback (most recent call last)
 in ()
     17 yv = y.values
     18
---> 19 le_encoder_X = le.fit(Xv)
     20 le_encoded_X = le.fit(Xv)
     21

F:\Anaconda\lib\site-packages\sklearn\preprocessing\label.py in fit(self, y)
     93         self : returns an instance of self.
     94         """
---> 95         y = column_or_1d(y, warn=True)
     96         self.classes_ = np.unique(y)
     97         return self

F:\Anaconda\lib\site-packages\sklearn\utils\validation.py in column_or_1d(y, warn)
    612         return np.ravel(y)
    613
--> 614     raise ValueError("bad input shape {0}".format(shape))
    615
    616

ValueError: bad input shape (66, 4)
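The traceback shows that LabelEncoder.fit passes its input through column_or_1d, which only accepts a 1-D array, while Xv has shape (66, 4). One possible way around this, sketched below, is to fit a separate LabelEncoder on each feature column and one more on the target. The contents of gender.csv are not shown, so the small DataFrame and its column names "Colour" and "Music" are hypothetical stand-ins, and the sketch assumes every feature column holds strings.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for gender.csv; the real column names are not shown.
gender_data = pd.DataFrame({
    "Colour": ["blue", "pink", "blue", "green", "pink", "green"],
    "Music": ["rock", "pop", "rock", "jazz", "pop", "rock"],
    "Gender": ["M", "F", "M", "M", "F", "F"],
})

X = gender_data.drop(columns=["Gender"])
y = gender_data["Gender"]

# LabelEncoder accepts only 1-D input, so fit a separate encoder per column
# and keep each one so its mapping can be reused or inverted later.
feature_encoders = {}
encoded_X = pd.DataFrame(index=X.index)
for column in X.columns:
    encoder = LabelEncoder()
    encoded_X[column] = encoder.fit_transform(X[column])
    feature_encoders[column] = encoder

# The target is already 1-D, so a single dedicated encoder handles it.
target_encoder = LabelEncoder()
encoded_y = target_encoder.fit_transform(y)

model = DecisionTreeClassifier(random_state=0)
model.fit(encoded_X, encoded_y)
```

Keeping a distinct encoder per column avoids overwriting the fitted classes each time fit is called on a shared instance, and target_encoder.inverse_transform can map predictions back to the original labels. In scikit-learn 0.20 or later, OrdinalEncoder accepts a 2-D feature matrix directly and may be a simpler choice for the features.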