Naive Bayes is a popular baseline method for text classification.
Questions tagged [naivebayes]
1035 questions
4
votes
2 answers
org.apache.spark.sql.AnalysisException: Can't extract value from probability
I am using a Naive Bayes algorithm to classify articles and want to access the "probability" column of part of the results:
val Array(trainingDF, testDF) = rawDataDF.randomSplit(Array(0.6, 0.4))
val ppline = MyUtil.createTrainPpline(rawDataDF)
val…

bruce
4
votes
1 answer
e1071 Package: naiveBayes prediction is slow
I am trying to run the naiveBayes classifier from the R package e1071. I am running into an issue where prediction takes longer than training, by a factor of ~300.
I was wondering if anyone else has observed this…

Tyler R.
4
votes
0 answers
How to Improve Accuracy of Record Linkage Algorithm
I have written a program that looks for probable duplicate records in a list of records. The current version of my program recognized 94 percent of the duplicates found by a manual check of the data, though if possible I need to get that up to 100…

Luca
4
votes
1 answer
Python: Loaded NLTK Classifier not working
I'm trying to train an NLTK classifier for sentiment analysis and then save the classifier using pickle.
The freshly trained classifier works fine. However, if I load a saved classifier, it will output either 'positive' or 'negative' for…

Exzone
4
votes
4 answers
What should be taken as m in m estimate of probability in Naive Bayes
What should be taken as m in m estimate of probability in Naive Bayes?
So for this example
what m value should I take? Can I take it to be 1?
Here p = prior probability = 0.5.
So can I take P(a_i|selected) = (n_c + 0.5) / (3 + 1)?
For Naive Bayes…
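For reference, the m-estimate mentioned in this question is usually written as (n_c + m·p) / (n + m), where n_c is the count of the attribute value within the class, n is the class count, p is the prior estimate and m is the equivalent sample size. The sketch below is only an illustration of that formula; the function name `m_estimate` is hypothetical, and the numbers mirror the p = 0.5, n = 3, m = 1 case from the excerpt.

```python
def m_estimate(n_c, n, p, m):
    """Return the m-estimate of a conditional probability.

    n_c: number of class examples with the attribute value
    n:   number of examples in the class
    p:   prior estimate of the probability
    m:   equivalent sample size (weight given to the prior)
    """
    return (n_c + m * p) / (n + m)


# With m = 1, p = 0.5 and n = 3 this reduces to (n_c + 0.5) / (3 + 1).
unseen_value = m_estimate(0, 3, 0.5, 1)
seen_twice = m_estimate(2, 3, 0.5, 1)
```

With m = 0 the estimate falls back to the raw frequency n_c / n, and larger m pulls the estimate toward p.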

clarkson
4
votes
1 answer
Why is the output of predict a factor with 0 levels?
I generated a naive Bayes classifier using e1071 and then ran the following
test <- as.data.frame(readRDS('ContactsComplaints2014.rds'))
load(file='data2015.nb.RData')
prediction <- predict(data2015.nb,subset(test[1:3000,],…

wlad
4
votes
2 answers
Python -- SciKit -- Text Feature Extraction of Classifier
I have to classify articles into my custom categories. So I chose MultinomialNB from SciKit. I am doing supervised learning. So I have an editor who looks at the articles daily and then tags them. Once they are tagged I include them into my Learning…

planet260
4
votes
5 answers
Naive Bayesian spam filter question
I am planning to implement a spam filter using the Naive Bayesian classification model.
Online I see a lot of info on Naive Bayesian classification, but the problem is that it is mostly mathematical material rather than a clear statement of how it's done. And the problem is…

Microkernel
4
votes
2 answers
Model in Naive Bayes
When we train a decision tree classifier on a training set, we get a tree model. This model can be converted to rules and incorporated into Java code.
Now if I train the training set using Naive Bayes, in what form is the model?…
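As a rough illustration of what such a model looks like: for categorical features, a trained Naive Bayes model is essentially a set of class priors plus one table of conditional probabilities per class and feature. The sketch below is a minimal version without smoothing; the function name `train_nb` and the toy data are hypothetical and not taken from the question.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Estimate priors and per-feature conditional probabilities by relative frequency."""
    class_counts = Counter(labels)
    priors = {c: n / len(labels) for c, n in class_counts.items()}

    # (class, feature index) -> counts of each feature value
    counts = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(c, i)][value] += 1

    likelihoods = {
        key: {v: n / class_counts[key[0]] for v, n in value_counts.items()}
        for key, value_counts in counts.items()
    }
    return priors, likelihoods


# Hypothetical toy data: one categorical feature, two classes.
rows = [("sunny",), ("rainy",), ("sunny",), ("rainy",)]
labels = ["yes", "yes", "no", "yes"]
priors, likelihoods = train_nb(rows, labels)
```

The two dictionaries are the whole model, so they can be stored as plain lookup tables and evaluated in any language.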

user1221330
3
votes
2 answers
Naive bayesian classifier - multiple decisions
I need to know whether the Naive Bayesian classifier can be used to generate multiple decisions. I couldn't find any examples with evidence supporting multiple decisions. I'm new to this area, so I'm a bit confused.
Actually I need to…

chandimak
3
votes
1 answer
AttributeError: 'Series' object has no attribute 'to_coo'
I am trying to use a Naive Bayes classifier from the sklearn module to classify whether movie reviews are positive. I am using a bag of words as the features for each review and a large dataset with sentiment scores attached to reviews.
df_bows =…

Luke Turvey
3
votes
1 answer
How to calculate the precision and F1?
I am using TF-IDF for feature selection and a Naive Bayes classifier. I want to calculate the total accuracy and precision.
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_count, train_y, xvalid_count)
print("NB, Count Vectors: ",…
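For context, precision and F1 follow directly from the counts of true positives, false positives and false negatives. The sketch below computes them in plain Python for a binary problem; the function name `precision_recall_f1` and the label lists are hypothetical. scikit-learn also provides `precision_score` and `f1_score` in `sklearn.metrics`, which would be the more usual choice alongside `MultinomialNB`.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for illustration only.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```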

Irfan Yaqub
3
votes
1 answer
Why does training my Naive Bayes Classifier take so much memory?
Recently, I have been working on a project which requires sentiment analysis of Twitter data. I am using a Naive Bayes Classifier from the TextBlob library, and am trying to train it with 1.6 million tweets (which can be found here if anyone is…

KoderKaran
3
votes
0 answers
Naive Bayes classification of 20 different kinds of messages predicts only three different classes
I was trying to implement a naïve Bayes model to classify messages. The output of my code contains only three different predicted classes, while there are 20 classes in the training examples. I'm wondering if I did it correctly? Can you see if I did…

user42493
3
votes
1 answer
How to calculate probability from probability density function in the Naive Bayes Classifier?
I am implementing the Gaussian Naive Bayes algorithm:
# importing modules
import pandas as pd
import numpy as np
# create an empty dataframe
data = pd.DataFrame()
# create our target variable
data["gender"] = ["male","male","male","male",
…
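On the question in the title: a Gaussian density value is not itself a probability and can exceed 1. A common approach is to multiply the class prior by the per-feature densities and then normalize across classes. The sketch below illustrates this under assumed parameters; the means, variances and the single feature are hypothetical and are not taken from the truncated snippet above.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution N(mean, var) at x (a density, not a probability)."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posteriors(x, params, priors):
    """Normalize prior * product of feature densities over all classes.

    params[c] is a list of (mean, var) pairs, one per feature.
    """
    scores = {}
    for c, feats in params.items():
        score = priors[c]
        for xi, (mean, var) in zip(x, feats):
            score *= gaussian_pdf(xi, mean, var)
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical per-class (mean, variance) for one feature, for illustration only.
params = {"male": [(5.9, 0.04)], "female": [(5.4, 0.09)]}
priors = {"male": 0.5, "female": 0.5}
result = posteriors([5.8], params, priors)
```

The normalized values in `result` sum to 1 and can be read as posterior probabilities, whereas the raw `gaussian_pdf` outputs cannot. For many features, summing log densities instead of multiplying avoids numerical underflow.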

Anwesa Roy