Naive Bayes is a popular baseline method for text classification.
Questions tagged [naivebayes]
1035 questions
-1 votes, 1 answer
Text Blob Naive Bayes classification
I am using the TextBlob library for classification with Naive Bayes. I have a training set, and when I pass a word, it should be checked against the training set and classified accordingly; if the word is not present in the training set, it should not suggest any…

Dexter1611
-1 votes, 1 answer
How to convert a CSV file to SVM for ML training
I have a dataset that I wish to train multiple ML models on in Apache Spark 2.1.1. It consists of 10 columns, 2 of which contain strings. Removing these columns is not an option, as they are vital to the information I wish to gather. However, I am…

Nnamdi Affia
-1 votes, 1 answer
Text classification for binary output
I am not a data scientist and am very new to data science/machine learning.
My goal is to predict whether a given text belongs to a specific class or not.
I have looked at Naive Bayes to classify text into different classes, but here I have only one class.…

Sagar
-1 votes, 1 answer
ROC curve error
I have a dataset like this:
idx <- sample(1:nrow(df), nrow(df)*0.80)  # row indices for the 80% training split
train <- df[idx, ]
test <- df[-idx, ]
NaiveBayes1 <-naiveBayes(purchased ~ .,data=train)
pre1 <- predict(NaiveBayes1,test,probability = TRUE)
library(pROC)
roc1 <-…

joerna
-1 votes, 1 answer
Does naive bayes text classification require real world data
Given that the Bayesian formula is:
P(A|B) = (P(B|A) * P(A)) / P(B)
Let's say that I want to train a classifier to classify spam/ham. Let's also say that in the real world we get about 1% spam. So given a sample input, we would expect about 1%…

user98651
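A minimal worked sketch of the formula quoted in the question above. The 1% spam prior comes from the question; the function name `posterior` and the per-word likelihoods (50% of spam, 1% of ham) are hypothetical values chosen only for illustration. It compares the posterior under the real-world prior with the posterior under a prior taken from a balanced 50/50 training set.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) via Bayes' rule, expanding P(B) by total probability."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / p_b

# Hypothetical likelihoods: the word appears in 50% of spam and 1% of ham.
realistic = posterior(0.01, 0.5, 0.01)  # prior matches the 1% real-world spam rate
balanced = posterior(0.5, 0.5, 0.01)    # prior implied by a 50/50 training set
print(round(realistic, 3), round(balanced, 3))
```

Under these assumed numbers, the same evidence gives a much lower spam probability with the 1% prior than with the balanced prior, which is why the class proportions in the training data matter when P(A) is estimated from it.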
-1 votes, 1 answer
Naive Bayes method
Let's assume that I've got patients with information about their diseases and symptoms. I want to estimate the probability P(disease_i = TRUE | symptom_j = TRUE). I suppose that I should use an NB classifier, but every example I've found applies Naive Bayes…

ds_fan
-1 votes, 1 answer
Naive Bayes Classification with R - strange result
I have the following problem: I'd like to predict a factor variable "cancer" (yes or no) from two variables, "sex" and "agegroup", with a Bayes classifier.
These are my (fictional) sample…

D. Studer
-1 votes, 1 answer
How to pass a dynamic test instance in document classification in Java using Weka
I am new to Weka. Currently I am working on text classification using Weka and Java. My training dataset has one string attribute and one class attribute.
@RELATION test
@ATTRIBUTE tweet string
@ATTRIBUTE class {positive,negative}
I want to…

dennypanther
-1 votes, 1 answer
Low Accuracy in Implementing naive Bayes classifier
I have code for a naive Bayes classifier that implements the naive Bayes concept, but the accuracy this algorithm gives me is about 48%, which is much lower than MATLAB's built-in function for Naive Bayes (84%). Can anybody help me where is the…

Elnaz
-1 votes, 2 answers
Collecting Machine learning training data
I am very new to machine learning and need a couple of things clarified. I am trying to predict the probability of someone liking an activity based on their Facebook likes. I am using the Naive Bayes classifier, but am unsure about a couple of things.…

joethemow
-1 votes, 1 answer
external dataset learning in python for machine learning
Hi, I want to classify a dataset using a naive Bayes classifier. For that, I want to use an external dataset that I downloaded from Google. The dataset contains two folders, one for positive reviews and one for negative reviews. Each folder contains 1000 .txt files. How…

Sharmili Nag
-1 votes, 1 answer
Weka, which classifier to use, for two categorical and 10 numerical attributes
I have 10 columns of sound parameters, followed by 2 columns indicating which two instruments were recorded at that moment.
After that, I have data with 10 columns of sound parameters, and I need to predict which single instrument was used.
I…

Denis
-1 votes, 1 answer
How do I merge or combine error rates?
Let's say I have a dataset that has 9 continuous columns of data and 4 columns of categorical data. In Matlab, I separate the columns into two groups and do training/testing (naïve bayes) on them separately and determine that the continuous columns…

swabygw
-1 votes, 1 answer
Printing conditional probabilities from Naive Bayes Model in R
I have created a model using the e1071 package for a Naive Bayes classifier. I need to print the conditional probabilities in the format below.
P(C=c1)=0.32 P(A1=x1|c1)=0.33 P(A1=x2|c1)=0.67 P(A2=y1|c1)=0.25 P(A2=y2|c1)=0.75
P(A3=z1|c1)=0.26 P(A3=z2|c1)=0.49…

MrNeilP
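A rough sketch of how the output format requested in the question above could be produced. This is plain Python working from raw counts, not the e1071 API the question uses; the helper names `fit_tables` and `format_tables`, the toy rows, and the labels are all hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def fit_tables(rows, labels):
    """Estimate P(C=c) and P(A_j=v | C=c) by relative frequency (no smoothing)."""
    class_counts = Counter(labels)
    n = len(labels)
    priors = {c: class_counts[c] / n for c in class_counts}
    cond_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            cond_counts[(c, j)][v] += 1
    conds = {key: {v: k / class_counts[key[0]] for v, k in counts.items()}
             for key, counts in cond_counts.items()}
    return priors, conds

def format_tables(priors, conds):
    """Build one line per class: the prior, then each conditional probability."""
    lines = []
    for c in sorted(priors):
        parts = [f"P(C={c})={priors[c]:.2f}"]
        for (cls, j), table in sorted(conds.items()):
            if cls == c:
                parts += [f"P(A{j + 1}={v}|{c})={p:.2f}"
                          for v, p in sorted(table.items())]
        lines.append(" ".join(parts))
    return lines

# Toy data (hypothetical): two categorical attributes, two classes.
rows = [("x1", "y1"), ("x2", "y2"), ("x2", "y2"), ("x1", "y2")]
labels = ["c1", "c1", "c1", "c2"]
for line in format_tables(*fit_tables(rows, labels)):
    print(line)
```

Values that never occur with a class are simply absent from that class's line here; a real model would usually apply Laplace smoothing so those probabilities are not zero.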
-1 votes, 1 answer
Term Frequency and IDF - Clarification
Based on https://en.wikipedia.org/wiki/Tf%E2%80%93idf, IDF is used to offset the weight of frequently used words in a document (like "the", "of", etc.).
If I am applying stop-word removal before extracting features, should IDF be…

lives
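A small sketch of the IDF weighting discussed in the question above, assuming the plain log(N / df) variant without smoothing. The function name `idf` and the three toy documents are hypothetical.

```python
import math

def idf(docs):
    """Plain IDF: log(N / df), where df counts documents containing the term."""
    n = len(docs)
    df = {}
    for doc in docs:
        for term in set(doc.lower().split()):  # count each term once per document
            df[term] = df.get(term, 0) + 1
    return {term: math.log(n / count) for term, count in df.items()}

docs = ["the cat sat on the mat", "the dog chased the cat", "the bird sang"]
weights = idf(docs)
print(weights["the"], weights["cat"], weights["bird"])
```

In this toy corpus "the" appears in every document and gets weight zero, much as stop-word removal would achieve, while "cat" and "bird" still receive different weights. That suggests IDF continues to separate common from rare terms among the words that remain after stop words are removed.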