Questions tagged [naivebayes]

Naive Bayes is a popular (baseline) method for text classification.

1035 questions
4 votes, 2 answers

org.apache.spark.sql.AnalysisException: Can't extract value from probability

I am using a Naive Bayes algorithm to classify articles and want to access the “probability” column of part of the results: val Array(trainingDF, testDF) = rawDataDF.randomSplit(Array(0.6, 0.4)) val ppline = MyUtil.createTrainPpline(rawDataDF) val…
bruce
4 votes, 1 answer

e1071 Package: naiveBayes prediction is slow

I am trying to run the naiveBayes classifier from the R package e1071. I am running into an issue where prediction takes longer than training, by a factor of ~300. I was wondering if anyone else has observed this…
Tyler R.
4 votes, 0 answers

How to Improve Accuracy of Record Linkage Algorithm

I have written a program that looks for probable duplicate records in a list of records. The current version of my program recognizes 94 percent of the duplicates found by a manual check of the data, though if possible I need to get that up to 100…
Luca
4 votes, 1 answer

Python: Loaded NLTK Classifier not working

I'm trying to train an NLTK classifier for sentiment analysis and then save the classifier using pickle. The freshly trained classifier works fine. However, if I load a saved classifier, the classifier will output either 'positive' or 'negative' for…
Exzone
4 votes, 4 answers

What should be taken as m in the m-estimate of probability in Naive Bayes

What should be taken as m in the m-estimate of probability in Naive Bayes? For this example, what value of m should I take? Can I take it to be 1? Here p = prior probability = 0.5. So can I take P(a_i|selected) = (n_c + 0.5) / (3 + 1)? For Naive Bayes…
clarkson
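
As a rough illustration of the formula in the excerpt above, the sketch below assumes the usual m-estimate form (n_c + m·p) / (n + m), where n_c is the number of class examples with the attribute value, n is the number of class examples, p is the prior estimate, and m is the equivalent sample size. The function name `m_estimate` is hypothetical. With p = 0.5, m = 1, and n = 3 it reproduces the (n_c + 0.5) / (3 + 1) expression from the question; it does not settle which m is appropriate.

```python
def m_estimate(n_c: int, n: int, p: float, m: float) -> float:
    """m-estimate of P(a_i | class): (n_c + m * p) / (n + m)."""
    return (n_c + m * p) / (n + m)


# Values taken from the excerpt: n = 3 examples in the class, p = 0.5, m = 1.
for n_c in range(4):
    print(n_c, m_estimate(n_c, n=3, p=0.5, m=1))

# With m = 0 the estimate falls back to the raw relative frequency n_c / n.
print(m_estimate(2, n=3, p=0.5, m=0))
```

Choosing p = 1/k and m = k for an attribute with k possible values gives (n_c + 1) / (n + k), which is ordinary add-one (Laplace) smoothing.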
4 votes, 1 answer

Why is the output of predict a factor with 0 levels?

I generated a naive Bayes classifier using e1071 and then ran the following: test <- as.data.frame(readRDS('ContactsComplaints2014.rds')) load(file='data2015.nb.RData') prediction <- predict(data2015.nb,subset(test[1:3000,],…
wlad
4 votes, 2 answers

Python -- SciKit -- Text Feature Extraction of Classifier

I have to classify articles into my custom categories, so I chose MultinomialNB from SciKit. I am doing supervised learning: I have an editor who looks at the articles daily and then tags them. Once they are tagged I include them into my Learning…
planet260
4 votes, 5 answers

naive bayesian spam filter question

I am planning to implement a spam filter using the Naive Bayesian classification model. Online I see a lot of information on Naive Bayesian classification, but the problem is that it is mostly mathematical material rather than a clear statement of how it's done. And the problem is…
Microkernel
4 votes, 2 answers

Model in Naive Bayes

When we train on a training set using a decision tree classifier, we get a tree model. This model can be converted to rules and incorporated into Java code. Now if I train on the training set using Naive Bayes, in what form is the model?…
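
One way to picture the answer to the question above: for categorical features, a trained Naive Bayes model can be stored as a table of class priors plus one conditional probability table per feature and class. The sketch below is a minimal illustration under that assumption; the function name `train_nb`, the toy data, and the absence of smoothing are choices made here, not something taken from the question.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Return (priors, cond) with cond[label][feature_index][value] = P(value | label)."""
    class_counts = Counter(labels)
    n = len(labels)
    # Class priors: relative frequency of each label.
    priors = {c: k / n for c, k in class_counts.items()}
    # Count how often each feature value occurs within each class.
    counts = defaultdict(lambda: defaultdict(Counter))
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            counts[c][i][v] += 1
    # Turn counts into conditional probabilities (no smoothing in this sketch).
    cond = {
        c: {i: {v: k / class_counts[c] for v, k in vals.items()}
            for i, vals in feats.items()}
        for c, feats in counts.items()
    }
    return priors, cond


# Hypothetical toy data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"), ("rainy", "no")]
labels = ["yes", "no", "no", "yes"]
priors, cond = train_nb(rows, labels)
print(priors)
print(cond["yes"])
```

Because the model is just these lookup tables, it could be exported as plain numbers and evaluated in any language by multiplying the prior with the matching table entries.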
3 votes, 2 answers

Naive Bayesian classifier - multiple decisions

I need to know whether the Naive Bayesian classifier can be used to generate multiple decisions. I couldn't find any examples with evidence supporting multiple decisions. I'm new to this area, so I'm a bit confused. Actually I need to…
chandimak
3 votes, 1 answer

AttributeError: 'Series' object has no attribute 'to_coo'

I am trying to use a Naive Bayes classifier from the sklearn module to classify whether movie reviews are positive. I am using a bag of words as the features for each review and a large dataset with sentiment scores attached to reviews. df_bows =…
3 votes, 1 answer

How to calculate the precision and F1?

I am using TF-IDF for feature selection and a Naive Bayes classifier. I want to calculate the total accuracy and precision. accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_count, train_y, xvalid_count) print("NB, Count Vectors: ",…
Irfan Yaqub
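
For reference, precision and F1 for a single positive class can be computed directly from true and predicted labels. The sketch below is plain Python with made-up labels; the function name `precision_recall_f1` is hypothetical and is not part of the question's `train_model` helper. scikit-learn's `sklearn.metrics.precision_score` and `f1_score` compute the same quantities.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Guard against empty denominators by returning 0.0 in that case.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(precision_recall_f1(y_true, y_pred, positive=1))
```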
3 votes, 1 answer

Why does training my Naive Bayes Classifier take so much memory?

Recently, I have been working on a project which requires sentiment analysis of Twitter data. I am using a Naive Bayes classifier from the TextBlob library, and am trying to train it with 1.6 million tweets (which can be found here if anyone is…
3 votes, 0 answers

Naive Bayes classification of 20 different kinds of messages predicts only three different classes of messages

I was trying to implement a naïve Bayes model to classify messages. The output of my code contains only three different predicted classes, while there are 20 classes in the training examples. I'm wondering if I did it correctly? Can you see if I did…
user42493
3 votes, 1 answer

How to calculate probability from probability density function in the Naive Bayes Classifier?

I am implementing Gaussian Naive Bayes Algorithm: # importing modules import pandas as pd import numpy as np # create an empty dataframe data = pd.DataFrame() # create our target variable data["gender"] = ["male","male","male","male", …
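
A common way to handle the step the title asks about is to use the Gaussian density value itself as the per-feature likelihood and to obtain probabilities only after normalizing prior × likelihood across classes. The sketch below illustrates that idea; the per-class means and variances are hypothetical numbers, not the data from the question, and the function names are chosen here.

```python
import math


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of N(mean, var) at x. This is a density, not a probability."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(priors, likelihoods):
    """Normalize prior * likelihood products so they sum to 1 across classes."""
    scores = {c: priors[c] * likelihoods[c] for c in priors}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical (mean, variance) of one feature per class.
stats = {"male": (5.8, 0.04), "female": (5.4, 0.09)}
priors = {"male": 0.5, "female": 0.5}
x = 6.0
likelihoods = {c: gaussian_pdf(x, mu, var) for c, (mu, var) in stats.items()}
print(likelihoods)
print(posteriors(priors, likelihoods))
```

Density values can exceed 1 when the variance is small, which is expected; only the normalized posteriors need to lie between 0 and 1.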