I'm building a machine learning model for Twitter sentiment analysis using Naive Bayes and TF-IDF. My initial dataset has 500 samples with a total of 1500 features. Using K-fold cross-validation (10 folds), I was able to get an accuracy of 70%.

Now that that's done, I want my model to be able to predict new data, preferably a whole new dataset, from its past experience. So my idea is something like this:

from sklearn.naive_bayes import MultinomialNB

# Base model: training features (TF-IDF columns) and labels
X_train = df_tfidf.drop(['Sentimen'], axis=1)
y_train = df_tfidf['Sentimen']

# New data to predict on
X_test = df_tfidf_tes.drop(['Sentimen'], axis=1)
y_test = df_tfidf_tes['Sentimen']

model = MultinomialNB()
model.fit(X_train, y_train)

prediction = model.predict(X_test)
prediction

But then it returns this error: shapes (5,9) and (1534,2) not aligned: 9 (dim 1) != 1534 (dim 0)
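To make the numbers in that message concrete: MultinomialNB scores samples with a matrix product between the input and a per-feature weight matrix learned during fit, so the input needs as many columns as the training data had. Below is a minimal NumPy sketch of just that shape check. The arrays are zero-filled placeholders, and the names `X_new`, `weights` and `X_aligned` are illustrative, not taken from the code above.

```python
import numpy as np

# Placeholder arrays with the shapes from the error message
X_new = np.zeros((5, 9))        # 5 new samples, 9 features
weights = np.zeros((1534, 2))   # 1534 training features, 2 classes

# Inner dimensions differ (9 vs 1534), so the product is undefined
try:
    np.dot(X_new, weights)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print(err)

# With the same 1534 columns as the training data, the product works
X_aligned = np.zeros((5, 1534))
scores = np.dot(X_aligned, weights)
print(scores.shape)  # one score per sample and class
```

So the number of rows in the new data (5) is not the problem; only the number of feature columns has to match.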

Sure enough, my new dataset is small, with only 5 samples and 9 features, and according to this answer on Stack Exchange, it's not possible to predict on two datasets using one model.

But doesn't that defeat the purpose of machine learning? Or am I approaching things wrong here?
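For reference, a common way to get new data into the same feature space is to fit the TF-IDF step once on the training text and then only transform the new text with that fitted vectorizer, so both matrices share the same columns. Here is a minimal sketch, assuming the raw tweet text is still available and that scikit-learn's `TfidfVectorizer` produced the features. The tweets, labels, and the names `train_texts`, `new_texts` and `clf` are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up placeholder tweets and labels (not the real dataset)
train_texts = [
    "i love this phone",
    "what a great day",
    "this movie is awesome",
    "i hate waiting in line",
    "this service is terrible",
    "worst meal i have had",
]
train_labels = ["positive", "positive", "positive",
                "negative", "negative", "negative"]

# New, unseen tweets; words missing from the training vocabulary are ignored
new_texts = ["great phone", "terrible movie, i hate it"]

# The pipeline fits the vectorizer on the training text only
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(train_texts, train_labels)

# The fitted vectorizer maps any text onto the training vocabulary
vectorizer = clf.named_steps["tfidfvectorizer"]
X_train = vectorizer.transform(train_texts)
X_new = vectorizer.transform(new_texts)
print(X_train.shape[1] == X_new.shape[1])  # same number of columns

# predict() applies the same transform internally before classifying
predictions = clf.predict(new_texts)
print(predictions)
```

Keeping the vectorizer and classifier in one pipeline means the fitted vocabulary travels with the model, so new data of any size can be passed in as raw text.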

Abbi KRK
  • The fact that your sets are of different shapes does not mean that machine learning in general does not work. Assuming this seems like quite the leap. – tripleee Dec 13 '21 at 06:17
  • @tripleee Well, I was asking for a way to do it, or some method I need to use to achieve what I asked about. Sorry if I sound like I'm assuming things. – Abbi KRK Dec 13 '21 at 12:36
