For a project I'm using an XGBoost classifier. Here is part of the code:
import ...
tab = 'tab.csv'
dataset = read_csv(tab, decimal=".")
target_attribute = dataset['AVG']
a = random.randrange(1, 1000)
seed = a
test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(dataset, target_attribute, test_size=test_size,
random_state=seed)
X_train = np.array(X_train)
X_test = np.array(X_test)
X_train = X_train.astype(float)
X_test = X_test.astype(float)
model = XGBClassifier()
model.fit(X_train, y_train)
print(model)
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
As soon as I use a certain target_attribute I get the following error:
ValueError: Classification metrics can't handle a mix of continuous and multiclass targets
Since I'm doing classification, my internet searches so far haven't helped me solve this problem. I suspect the cause is that this column in the .csv contains both integer and real (non-integer) values, but I have no idea how to fix it. I hope somebody here can help me.
EDIT: I already tried casting every column with dtype 'int64' to 'float64'. Sadly, this didn't help.
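A minimal sketch of how the suspicion about mixed integer and real values could be checked. It assumes the error arises because the true labels and the rounded predictions are treated as different kinds of target. `describe_target` is a hypothetical helper, not part of scikit-learn. It only approximates the rule scikit-learn documents for `sklearn.utils.multiclass.type_of_target`, and the sample values are made up for illustration.

```python
def describe_target(values):
    """Simplified approximation of how scikit-learn labels a target.

    Hypothetical helper for illustration only: any non-whole number makes
    the target "continuous"; otherwise the number of distinct values
    decides between "binary" and "multiclass".
    """
    as_floats = [float(v) for v in values]
    if any(not v.is_integer() for v in as_floats):
        return "continuous"
    return "binary" if len(set(as_floats)) <= 2 else "multiclass"


# Made-up stand-ins for y_test and for the rounded predictions.
y_test_sample = [3, 4.5, 2, 3, 1]
predictions_sample = [round(v) for v in [3.0, 4.0, 2.0, 3.0, 1.0]]

print(describe_target(y_test_sample))       # continuous
print(describe_target(predictions_sample))  # multiclass
```

If the real 'AVG' column behaves like `y_test_sample`, the true labels would count as continuous while the rounded predictions count as multiclass, which matches the two target types named in the error message. Casting int64 columns to float64 would not change this, because the non-integer values would still be present.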