Questions tagged [naivebayes]

Naive Bayes is a popular baseline method for text classification.

1035 questions
7 votes, 1 answer

Why does Naive Bayes fail to solve XOR

I have recently started learning about algorithms related to natural language processing, and have come across various sites which indicate that Naive Bayes cannot capture the XOR concept. First, I do not understand what exactly the XOR problem is.…
qualitytest
7 votes, 4 answers

Naive Bayesian spam filtering effectiveness

How effective is naive Bayesian filtering for filtering spam? I heard that spammers easily bypass them by stuffing extra non-spam-related words. What programming techniques can you use with Bayesian filters to prevent that?
Waleed Eissa
7 votes, 1 answer

Get a classification report stating the class wise precision and recall for multinomial Naive Bayes using 10 fold cross validation

I have the following piece of code which uses an NB classifier for a multi-class classification problem. The function performs cross-validation by storing the accuracies and printing the average later. What I instead want is a classification report…
CuriousCat
7 votes, 1 answer

Naive Bayes: the within-class variance in each feature of TRAINING must be positive

When trying to fit Naive Bayes: training_data = sample; % target_class = K8; # train model nb = NaiveBayes.fit(training_data, target_class); # prediction y = nb.predict(cluster3); I get an error: ??? Error using ==>…
G Gr
6 votes, 1 answer

Multinomial Naive Bayes Classifier

I have been looking for a multinomial naive Bayes classifier on CRAN, and so far all I can come up with is the binomial implementation in package e1071. Does anyone know of a package that has a multinomial Bayes classifier?
Timothy P. Jurka
6 votes, 3 answers

After training my own classifier with nltk, how do I load it in textblob?

The built-in classifier in textblob is pretty dumb. It's trained on movie reviews, so I created a huge set of examples in my context (57,000 stories, categorized as positive or negative) and then trained it using nltk. I tried using textblob to…
Marc Maxmeister
6 votes, 1 answer

ValueError: operands could not be broadcast together with shapes in Naive bayes classifier

Getting straight to the point: 1) My goal was to apply NLP and a machine learning algorithm to classify a dataset containing sentences into 5 different (numeric) categories. For example, "I want to know details of my order -> 1". Code: import…
6 votes, 1 answer

dimension mismatch error in CountVectorizer MultinomialNB

Before I post this question, I have to say I've thoroughly read more than 15 similar topics on this board, each with somewhat different recommendations, but none of them got me to a working solution. Ok, so I split my 'spam email' text data…
Chris T.
6 votes, 1 answer

Simple text classification using naive bayes (weka) in java

I am trying to do text classification with naive Bayes using the Weka library in my Java code, but I think the result of the classification is not correct, and I don't know what the problem is. I use an ARFF file for the input. This is my training data: @relation…
6 votes, 1 answer

How to find which columns affect a prediction in R

Say, I am working on a machine learning model in R using naive bayes. So I would build a model using the naiveBayes package as follows model <- naiveBayes(Class ~ ., data = HouseVotes84) I can also print out the weights of the model by just…
saltandwater
6 votes, 1 answer

How to train a naive bayes classifier with pos-tag sequence as a feature?

I have two classes of sentences. Each has reasonably distinct pos-tag sequence. How can I train a Naive-Bayes classifier with POS-Tag sequence as a feature? Does Stanford CoreNLP/NLTK (Java or Python) provide any method for building a classifier…
6 votes, 3 answers

How to train large Dataset for classification

I have a training dataset of 1600000 tweets. How can I train on a dataset this large? I have tried using nltk.NaiveBayesClassifier, but it would take more than 5 days to train. def extract_features(tweet): tweet_words =…
Shahriar
6 votes, 1 answer

Semi-supervised Naive Bayes with NLTK

I have built a semi-supervised version of NLTK's Naive Bayes in Python based on the EM (expectation-maximization algorithm). However, in some iterations of EM I am getting negative log-likelihoods (the log-likelihoods of EM must be positive in every…
6 votes, 2 answers

Alternatives to Naive Bayes algorithm

We're trying to implement a semantic searching algorithm to give suggested categories based on a user's search terms. At the moment we have implemented the Naive Bayes probabilistic algorithm to return the probabilities of each category in our data…
Fogmeister
5 votes, 2 answers

class_weight = 'balanced' equivalent for naive bayes

I'm performing some (binary) text classification with two different classifiers on the same unbalanced data. I want to compare the results of the two classifiers. When using sklearn's logistic regression, I have the option of setting the class_weight…
dumbchild