I'm trying to build a confusion matrix to see how well my model performed. I split my data into x and y training and testing sets. However, to build the confusion matrix, I need the y_test data (the predicted data) and the actual data. Is there a way to see the actual results for the y_test data? Here's a snippet of my code:

x_train, x_test, y_train, y_test = train_test_split(a, yy, test_size=0.2, random_state=1)


model = MultinomialNB()  # don't forget these brackets here
model.fit(x_train,y_train.ravel())

#CONFUSION MATRIX
confusion = confusion_matrix(y_test, y_test)
print(confusion)
print(len(y_test))

1 Answer

Your y_test is the actual data, and the output of the model's predict method on x_test is the predicted data.

y_pred = model.predict(x_test)

confusion = confusion_matrix(y_test, y_pred)
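
For completeness, here is a minimal end-to-end sketch of the same idea. The arrays `a` and `yy` below are made-up stand-ins (random non-negative counts and a binary label), since the question does not show the real data; the `labels=[0, 1]` argument is only there to pin the matrix to 2x2 for this toy example.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in data: 100 samples of non-negative counts, 2 classes.
rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(100, 5))
yy = (a[:, 0] > a[:, 1]).astype(int)

x_train, x_test, y_train, y_test = train_test_split(
    a, yy, test_size=0.2, random_state=1
)

model = MultinomialNB()
model.fit(x_train, y_train)

# y_test holds the actual labels; predict() produces the predicted labels.
y_pred = model.predict(x_test)

# Rows are actual classes, columns are predicted classes.
confusion = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(confusion)

# Comparing y_test with itself can only fill the diagonal,
# which is why the original call looked like a perfect score.
confusion_self = confusion_matrix(y_test, y_test, labels=[0, 1])
print(confusion_self)
```

The entries of `confusion` always sum to `len(y_test)`, so that is a quick sanity check that every test sample was counted once.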
ywbaek
  • I might be misunderstanding this, but I thought y_test is the prediction. When you split your data, you get the small x_test sample, and your model makes predictions based on the x_test data, resulting in y_test. – ralph_cifarello Apr 13 '20 at 22:06
  • @amanaman There is no prediction involved when splitting the data. The `train_test_split` function simply splits the data into `train` and `test` datasets. You are right that the model makes predictions based on the `x_test` data. However, what you do afterwards is compare those predictions with `y_test`, which is the actual data. – ywbaek Apr 14 '20 at 00:12
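
A small sketch may help show that the split itself involves no model. The arrays `X` and `y` here are hypothetical: row `i` of `X` is `[2*i, 2*i + 1]` and its label is `i % 2`, so each test row can be traced back to its original label.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: row i is [2*i, 2*i + 1] and its label is i % 2.
X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# Recover each test row's original index from its first feature.
row_ids = x_test[:, 0] // 2

# y_test is just the original labels of the held-out rows; nothing was predicted.
print(np.array_equal(y_test, y[row_ids]))
print(len(x_train), len(x_test))
```

No model is fitted anywhere in this snippet, yet `y_test` already exists, so it can only be the actual labels.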