Naive Bayes is a popular baseline method for text classification.
Questions tagged [naivebayes]
1035 questions
-1
votes
2 answers
How to Approach Creating an Accurate Multiclass Multinomial Naive Bayes with Unbalanced Data
I have used sklearn to create a basic multiclass naive bayes text classifier. I have 3 classes and around 800 rows of data. Class A has 564 rows, Class B has 159, and Class C has 82. As you can see the data is unbalanced among the classes and I…

ssal
-1
votes
1 answer
Preparing data for Multinomial Naive Bayes and running algorithm - ValueError: Found input variables with inconsistent numbers of samples
I want to run the Multinomial NB algorithm to predict how many thumbs-up a comment will get on Google Play.
The data are scraped from reviews of offline navigation applications.
I tried running the algorithm using lists, but that doesn't help; the input must be…

Nikola Greb
-1
votes
1 answer
sklearn: Found arrays with inconsistent numbers of samples when calling naive_bayes.MultinomialNB(
I have looked at similar questions, such as this one, but none of the mentioned solutions worked in my case.
I am trying to build a text classification prediction model.
def train_model(classifier, feature_vector_train, label,…

leena
-1
votes
1 answer
Avoid getting nonsensical values (e.g. -0.0) when classifying very long texts with Naive Bayes
I've written a fairly efficient implementation of a multinomial Naive Bayes classifier and it works like a charm. That is, until the classifier encounters very long messages (on the order of 10k words), where the prediction results are…

Nocturn9X
-1
votes
2 answers
Get close words from a dictionary (database) for a misspelled word using the Naive Bayes algorithm
I would like to use Naive Bayes for text classification to get close words from a dictionary (database) when the user misspells a word. For example: the user enters "sheese" and the output would be "cheese".
How can I use it, knowing that my…

littleX
-1
votes
1 answer
How to use a MultinomialNB model with live data?
I am new to machine learning. I am trying to develop a sentiment analysis application for a CRM project. My model predicts the follow-up status when a user enters a follow-up comment. To create the model, I made the program read old comments and…

RatheeshTS
-1
votes
1 answer
Naive Bayes classifier for movie reviews has very low accuracy, despite several attempts at feature selection
I am fairly new to machine learning and have been tasked with building a machine learning model to predict whether a review is good (1) or bad (0). I have already tried a RandomForestClassifier, which gave an accuracy of 50%. I switched to…

geds133
-1
votes
1 answer
What is a classifier in Python (Gaussian Naive Bayes)?
Okay, so when I use the following code, what exactly does that "clf" part mean? Is that a variable? I know that's a classifier, but is a classifier a function in Python, or is it just a variable named that way, or what exactly? I am new to Python and…

bsikriwal
-1
votes
1 answer
How to test a new word set against my NLP Naive Bayes classifier
I built an NLP classifier based on Naive Bayes using Python's scikit-learn.
The point is that I want my classifier to classify a new text that does not belong to either my training or my testing data set.
In another model, like regression, I can…

Finn The Human
-1
votes
1 answer
Tuning the threshold in Binarizer for a BernoulliNB() classifier
I want to use the BernoulliNB() classifier, and my data is not binarized. So I want to choose the best binarization threshold with GridSearchCV().
My code looks like:
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import…

KejBi
-1
votes
2 answers
Naive Bayes accuracy increases as the alpha value increases
I'm using Naive Bayes for text classification and I have 100k records, of which 88k are positive-class records and 12k are negative-class records. I converted the sentences to unigrams and bigrams using CountVectorizer, and I took an alpha range from…

Ravi
-1
votes
3 answers
Text Preprocessing for classification - Machine Learning
What are the important steps for preprocessing Twitter texts for binary classification? What I did was remove the hashtags and keep the text without them. I also used some regular expressions to remove special characters. These are two functions I…
user9913522
-1
votes
2 answers
Python string: remove all zeros and one value before each zero in a string
I have a naive Python question: I need to remove each "0" and the value before it (any value) from a string "s":
s = "8 9243 8607 56 586262 4697424 00000 ik 0 C S<->L G 0 G 0 G 0 P …

user3698773
-1
votes
1 answer
Why is my classifier not working well on new data (data that was not part of the dataset)?
My dataset has COPD documents as positive data (86) and malaria (20) + diarrhea (20) + elephantiasis (20) as negative data. So the total number of documents in my dataset is 146, of which 86 are positive and 60 are negative. I have taken the training:testing ratio as…

Tanushree Tanu
-1
votes
1 answer
Why does Naive Bayes require balanced training data?
I created a word sentiment app using the Naive Bayes algorithm.
There are two classes in the training data for this classification: positive training data and negative training data. I take the unique words in each set of training data that has…

Gilang Pratama