I am classifying documents as positive or negative using a Naive Bayes model. It seems to work fine on a small, balanced dataset of around 72 documents. But when I add more negatively labeled documents, the classifier predicts everything as negative.
I split my dataset into an 80% training set and a 20% test set. Adding more negatively labeled documents definitely makes the dataset skewed. Could the skew be what makes the classifier predict every test document as negative? I am using the TextBlob/NLTK implementation of the Naive Bayes model.
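To make the skew concrete, here is a minimal sketch of how the label balance and the "always predict the majority label" baseline could be checked for an 80/20 split. The document list, the 36/364 counts, and the helper names `label_distribution` and `majority_baseline_accuracy` are placeholders for illustration, not my real data or anything from TextBlob/NLTK. If the classifier's test accuracy is about the same as this baseline, it is probably just following the class prior.

```python
from collections import Counter
import random


def label_distribution(examples):
    """Return {label: fraction} for a list of (text, label) pairs."""
    counts = Counter(label for _, label in examples)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline_accuracy(train, test):
    """Accuracy obtained by always predicting the most common training label."""
    majority_label = Counter(label for _, label in train).most_common(1)[0][0]
    hits = sum(1 for _, label in test if label == majority_label)
    return majority_label, hits / len(test)


# Placeholder data (made-up counts): 36 positive and 364 negative documents.
docs = [("pos doc %d" % i, "pos") for i in range(36)]
docs += [("neg doc %d" % i, "neg") for i in range(364)]

# Shuffle, then split 80% training / 20% test.
random.seed(0)
random.shuffle(docs)
split = int(0.8 * len(docs))
train, test = docs[:split], docs[split:]

print("train label fractions:", label_distribution(train))
print("test label fractions:", label_distribution(test))
print("majority baseline (label, accuracy):",
      majority_baseline_accuracy(train, test))
```

The same list of `(text, label)` pairs is the shape that TextBlob's `NaiveBayesClassifier` accepts as training data, so the check could be run on the real split before training.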
Any ideas?