Naive Bayes is a popular (baseline) method for text classification.
Questions tagged [naivebayes]
1035 questions
7 votes, 1 answer
Why does Naive Bayes fail to solve XOR
I have recently started learning about algorithms related to natural language processing, and have come across various sites that say Naive Bayes cannot capture the XOR concept. Firstly, I do not understand what exactly the XOR problem is.…
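To make the XOR question concrete, here is a minimal sketch, assuming two binary features and a hand-rolled Bernoulli Naive Bayes with Laplace smoothing (the function names `fit_bernoulli_nb` and `predict_proba` are illustrative, not from any library). In the XOR truth table each feature taken on its own is equally likely to be 0 or 1 in both classes, so a model that multiplies per-feature likelihoods has nothing to separate the classes with:

```python
from collections import Counter, defaultdict

# XOR truth table: the label is 1 exactly when the two binary inputs differ.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and P(x_j = 1 | class) with Laplace smoothing."""
    class_counts = Counter(y)
    n_features = len(X[0])
    ones = defaultdict(lambda: [0] * n_features)
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            ones[label][j] += value
    priors = {c: class_counts[c] / len(y) for c in class_counts}
    likelihoods = {
        c: [(ones[c][j] + alpha) / (class_counts[c] + 2 * alpha)
            for j in range(n_features)]
        for c in class_counts
    }
    return priors, likelihoods

def predict_proba(row, priors, likelihoods):
    """Posterior over classes, treating features as independent given the class."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for j, value in enumerate(row):
            p = likelihoods[c][j]
            score *= p if value == 1 else 1 - p
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

priors, likelihoods = fit_bernoulli_nb(X, y)
posteriors = [predict_proba(row, priors, likelihoods) for row in X]
for row, post in zip(X, posteriors):
    print(row, post)
```

Every input gets the same posterior for both classes, because the label depends only on the interaction between the two features, which the independence assumption discards.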

qualitytest
7 votes, 4 answers
Naive Bayesian spam filtering effectiveness
How effective is naive Bayesian filtering for filtering spam?
I heard that spammers easily bypass them by stuffing extra non-spam-related words. What programming techniques can you use with Bayesian filters to prevent that?

Waleed Eissa
7 votes, 1 answer
Get a classification report stating the class-wise precision and recall for multinomial Naive Bayes using 10-fold cross-validation
I have the following piece of code which uses an NB classifier for a multi-class classification problem. The function performs cross-validation by storing the accuracies and printing the average later. What I instead want is a classification report…
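One way to get class-wise numbers out of cross-validation is to keep the out-of-fold prediction for every sample and compute precision and recall once over the pooled predictions, instead of averaging per-fold accuracies. The sketch below is library-free and assumes a hypothetical `train_and_predict` callback standing in for the real classifier; in scikit-learn the same idea is usually expressed with `cross_val_predict` followed by `classification_report`.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx); fold i holds out every k-th sample starting at i."""
    for fold in range(k):
        test_idx = list(range(fold, n_samples, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx

def cross_val_predictions(X, y, k, train_and_predict):
    """Collect one out-of-fold prediction per sample."""
    predictions = [None] * len(y)
    for train_idx, test_idx in k_fold_indices(len(y), k):
        fold_pred = train_and_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        for i, label in zip(test_idx, fold_pred):
            predictions[i] = label
    return predictions

def per_class_report(y_true, y_pred):
    """Return {class: (precision, recall)} over all pooled predictions."""
    report = {}
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)
        actual = sum(t == c for t in y_true)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        report[c] = (precision, recall)
    return report

# Stand-in classifier (an assumption, so the sketch runs on its own): it always
# predicts the most frequent training label. Swap in the real NB fit/predict.
def majority_baseline(X_train, y_train, X_test):
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * len(X_test)

X = list(range(20))                 # placeholder features
y = ["a"] * 14 + ["b"] * 6          # imbalanced toy labels
y_pred = cross_val_predictions(X, y, 10, majority_baseline)
report = per_class_report(y, y_pred)
for label, (precision, recall) in report.items():
    print(label, precision, recall)
```

Pooling predictions gives a single report over the whole dataset; averaging per-fold reports is an alternative, but it can be noisy when a class has only a few samples per fold.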

CuriousCat
7 votes, 1 answer
Naive Bayes: the within-class variance in each feature of TRAINING must be positive
When trying to fit Naive Bayes:
training_data = sample;
target_class = K8;
% train model
nb = NaiveBayes.fit(training_data, target_class);
% prediction
y = nb.predict(cluster3);
I get an error:
??? Error using ==>…
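The error in the title points at a feature whose values are constant within one class, which leaves a Gaussian Naive Bayes model with a zero variance for that feature and class. As a rough diagnostic, here is a minimal Python sketch (the helper `zero_variance_features` and the toy data are hypothetical, not part of the original question) that lists the offending class and feature pairs:

```python
from collections import defaultdict
from statistics import pvariance

def zero_variance_features(rows, labels):
    """Return (class, feature_index) pairs whose values are constant within that class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    offenders = []
    for label, members in by_class.items():
        # zip(*members) iterates over the columns of this class's rows.
        for j, column in enumerate(zip(*members)):
            if pvariance(column) == 0:
                offenders.append((label, j))
    return offenders

# Hypothetical toy data: feature 1 is always 5.0 within class "a".
rows = [(1.0, 5.0), (2.0, 5.0), (3.0, 5.0), (1.5, 4.0), (2.5, 6.0)]
labels = ["a", "a", "a", "b", "b"]
offenders = zero_variance_features(rows, labels)
print(offenders)
```

Typical remedies are dropping such features, collecting more varied training data for that class, or using a distribution other than the Gaussian for that feature; which one fits depends on the data.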

G Gr
6 votes, 1 answer
Multinomial Naive Bayes Classifier
I have been looking for a multinomial naive Bayes classifier on CRAN, and so far all I can come up with is the binomial implementation in package e1071. Does anyone know of a package that has a multinomial Bayes classifier?

Timothy P. Jurka
6 votes, 3 answers
After training my own classifier with nltk, how do I load it in textblob?
The built-in classifier in textblob is pretty dumb. It's trained on movie reviews, so I created a huge set of examples in my context (57,000 stories, categorized as positive or negative) and then trained it using nltk. I tried using textblob to…

Marc Maxmeister
6 votes, 1 answer
ValueError: operands could not be broadcast together with shapes in Naive Bayes classifier
Getting straight to the point:
1) My goal was to apply NLP and a machine learning algorithm to classify a dataset of sentences into 5 different (numeric) categories, e.g. "I want to know details of my order -> 1".
Code:
import…

Shikhar Thapliyal
6 votes, 1 answer
dimension mismatch error in CountVectorizer MultinomialNB
Before I lodge this question, I have to say I've thoroughly read more than 15 similar topics on this board, each with somewhat different recommendations, but none of them solved my problem.
Ok, so I split my 'spam email' text data…
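A common cause of a dimension mismatch in this kind of pipeline (not necessarily the one in this question, whose code is truncated) is building a separate vocabulary for the test split, so the test matrix has a different number of columns than the one the classifier was trained on. The library-free sketch below illustrates the pattern of learning the vocabulary on the training documents only and reusing it for the test documents; with scikit-learn's `CountVectorizer` this corresponds to calling `fit_transform` on the training text and `transform` on the test text. The helper names are illustrative.

```python
def build_vocabulary(documents):
    """Map each distinct whitespace-separated token in the training documents to a column index."""
    vocabulary = {}
    for doc in documents:
        for token in doc.lower().split():
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary

def to_counts(documents, vocabulary):
    """Build count vectors with one column per vocabulary entry; unseen tokens are dropped."""
    vectors = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for token in doc.lower().split():
            if token in vocabulary:
                row[vocabulary[token]] += 1
        vectors.append(row)
    return vectors

train_docs = ["win cash now", "meeting at noon"]
test_docs = ["win a free meeting now"]

vocabulary = build_vocabulary(train_docs)   # learned from the training split only
X_train = to_counts(train_docs, vocabulary)
X_test = to_counts(test_docs, vocabulary)   # same columns as X_train

print(len(X_train[0]), len(X_test[0]))
```

Words that appear only in the test split ("a" and "free" here) are ignored, which keeps the column count identical across both matrices.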

Chris T.
6 votes, 1 answer
Simple text classification using Naive Bayes (Weka) in Java
I am trying to do text classification with the Weka Naive Bayes library in my Java code, but I think the classification result is not correct, and I don't know what the problem is. I use an ARFF file for the input.
This is my training data:
@relation…

Muhammad Haryadi Futra
6 votes, 1 answer
How to find which columns affect a prediction in R
Say I am working on a machine learning model in R using Naive Bayes. I would build a model using the naiveBayes function as follows
model <- naiveBayes(Class ~ ., data = HouseVotes84)
I can also print out the weights of the model by just…

saltandwater
6 votes, 1 answer
How to train a Naive Bayes classifier with a POS-tag sequence as a feature?
I have two classes of sentences. Each has a reasonably distinct POS-tag sequence. How can I train a Naive Bayes classifier with the POS-tag sequence as a feature? Does Stanford CoreNLP/NLTK (Java or Python) provide any method for building a classifier…

kundan
6 votes, 3 answers
How to train on a large dataset for classification
I have a training dataset of 1,600,000 tweets. How can I train on a dataset this large?
I have tried using nltk.NaiveBayesClassifier. It would take more than 5 days to train if I ran it.
def extract_features(tweet):
tweet_words =…

Shahriar
6 votes, 1 answer
Semi-supervised Naive Bayes with NLTK
I have built a semi-supervised version of NLTK's Naive Bayes in Python based on the EM (expectation-maximization) algorithm. However, in some iterations of EM I am getting negative log-likelihoods (the log-likelihoods of EM must be positive in every…

SUP
6 votes, 2 answers
Alternatives to Naive Bayes algorithm
We're trying to implement a semantic searching algorithm to give suggested categories based on a user's search terms.
At the moment we have implemented the Naive Bayes probabilistic algorithm to return the probabilities of each category in our data…

Fogmeister
5 votes, 2 answers
class_weight = 'balanced' equivalent for Naive Bayes
I'm performing some (binary) text classification with two different classifiers on the same unbalanced data. I want to compare the results of the two classifiers.
When using sklearn's logistic regression, I have the option of setting the class_weight…
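For reference, scikit-learn documents the 'balanced' heuristic as `n_samples / (n_classes * count_of_class)`. A minimal sketch of computing the same weights per sample by hand (the helper name `balanced_sample_weights` and the toy labels are assumptions for illustration):

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Weight each sample by n_samples / (n_classes * count of its class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return [class_weight[label] for label in labels]

# Hypothetical unbalanced labels: 8 "ham" and 2 "spam".
labels = ["ham"] * 8 + ["spam"] * 2
weights = balanced_sample_weights(labels)
print(weights[0], weights[-1])
```

With these weights each class contributes the same total weight. One possible use is passing them as `sample_weight` to the `fit` method of a Naive Bayes estimator such as `MultinomialNB`; note that this rescales both the class priors and the per-class feature counts, so whether it matches the logistic regression setup closely enough for a fair comparison is a judgment call.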

dumbchild