I'm building a machine learning model for Twitter sentiment analysis using Naive Bayes and TF-IDF. My initial dataset has 500 samples with around 1,500 features. Using 10-fold cross-validation, I was able to get an accuracy of 70%.
Now that that's done, I want my model to be able to predict new data, preferably a whole new dataset, from its past experience. So my idea is something like this:
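from sklearn.naive_bayes import MultinomialNB

# Base model: TF-IDF features and labels from the original dataset
X_train = df_tfidf.drop(['Sentimen'], axis=1)
y_train = df_tfidf['Sentimen']
# Test data: TF-IDF features and labels from the new dataset
X_test = df_tfidf_tes.drop(['Sentimen'], axis=1)
y_test = df_tfidf_tes['Sentimen']
# Train on the original data, then predict on the new data
model = MultinomialNB()
model.fit(X_train, y_train)
prediction = model.predict(X_test)
prediction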
But then it returns this error:
shapes (5,9) and (1534,2) not aligned: 9 (dim 1) != 1534 (dim 0)
Sure enough, my new dataset is small, with only 5 samples and 9 features, and according to this answer on Stack Exchange, it's not possible to use one model to predict on two different datasets.
But doesn't that defeat the purpose of machine learning? Or am I approaching things wrong here?
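For what it's worth, here is a minimal sketch of what I suspect is going on. It assumes the two TF-IDF tables were each built by fitting a separate scikit-learn `TfidfVectorizer`, which may not match my actual preprocessing. The tweets, labels, and variable names below are made up for illustration. It compares fitting a second vectorizer on the new data against reusing the one fitted on the training data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the real tweets
train_texts = [
    "i love this phone",
    "this movie is terrible",
    "what a great day",
    "i hate waiting in line",
]
train_labels = ["positive", "negative", "positive", "negative"]
new_texts = ["i love this great movie", "terrible day"]

# Learn the vocabulary (the feature columns) from the training text only
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Fitting a second vectorizer on the new text builds a different vocabulary,
# so the number of columns no longer matches what the model was trained on
X_new_wrong = TfidfVectorizer().fit_transform(new_texts)
print("train features:", X_train.shape[1])
print("separately fitted features:", X_new_wrong.shape[1])

# Reusing the fitted vectorizer maps the new text onto the training columns;
# words the vectorizer never saw during fit are simply ignored
X_new = vectorizer.transform(new_texts)
print("reused vectorizer features:", X_new.shape[1])

model = MultinomialNB()
model.fit(X_train, train_labels)
predictions = model.predict(X_new)
print(predictions)
```

If that assumption holds, the shape error would come from the mismatched columns rather than from the model itself, but I'm not sure this is the right way to handle a genuinely new dataset.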