I have preprocessed some data and want to train a Multinomial Naive Bayes classifier on it. The train data is 80% of my data and the test data is 20%.
The train data is an array of size 8452 and the test data is an array of size 4231.
If I predict on the train data, the following code runs just fine:
from sklearn.naive_bayes import MultinomialNB

multiNB = MultinomialNB()
model = multiNB.fit(x_train, y_train)
y_preds = model.predict(x_train)
But if I predict on the test data instead, i.e.
y_preds = model.predict(x_test)
I get the following error:
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0,
with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 8452 is different from 4231)
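For context, the signature `(n?,k),(k,m?)->(n?,m?)` says that the two operands of a matrix product must agree on the shared dimension `k`. A minimal sketch that triggers the same kind of error with plain NumPy, assuming (this is not confirmed above) that 8452 and 4231 are column counts; `row` and `weights` are hypothetical stand-ins, not scikit-learn internals:

```python
import numpy as np

# Assumption: 8452 and 4231 are feature (column) counts, not row counts.
n_fit_features = 8452      # columns seen when the model was fitted
n_predict_features = 4231  # columns in the array passed to predict

# Hypothetical stand-ins: one input row and a per-feature weight matrix for 2 classes.
row = np.zeros((1, n_predict_features))
weights = np.zeros((n_fit_features, 2))

raised = False
message = ""
try:
    row @ weights  # shapes (1, 4231) @ (8452, 2): the shared dimension k disagrees
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```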
If more information about my code would help, please ask. I am stuck here and do not understand what is causing this error, so any help is welcome.
This is how I obtained my train-test sets:
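import numpy as np

total_count = len(tokenised_reviews)
split = int(total_count * 0.8)  # the first 80% of the shuffled data is used for training
# Shuffle x_data and y_data with the same permutation so the pairs stay aligned
shuffle = np.random.permutation(total_count)
x = []
y = []
for i in range(total_count):
    x.append(x_data[shuffle[i]])
    y.append(y_data[shuffle[i]])
x_train = x[:split]
x_test = x[split:]
y_train = y[:split]
y_test = y[split:]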
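A row-wise split like this should leave both sets with the same number of columns, so comparing their shapes is a quick sanity check. A minimal sketch on hypothetical toy data (10 samples with 4 features each, not the real `x_data`), using NumPy index arrays in place of the loop:

```python
import numpy as np

# Hypothetical toy data: 10 samples with 4 features each.
x_data = np.arange(40).reshape(10, 4)
y_data = np.arange(10) % 2

rng = np.random.default_rng(0)
order = rng.permutation(len(x_data))  # one shared permutation for x and y
split = int(len(x_data) * 0.8)

# Index by the permuted row order; columns are left untouched.
x_train, x_test = x_data[order[:split]], x_data[order[split:]]
y_train, y_test = y_data[order[:split]], y_data[order[split:]]

print(x_train.shape, x_test.shape)
```

If the two shapes printed for the real data differ in their second entry, the train and test arrays do not have the same number of features.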