Questions tagged [naivebayes]

Naive Bayes is a popular baseline method for text classification.

1035 questions
-1
votes
2 answers

How to Approach Creating an Accurate Multiclass Multinomial Naive Bayes with Unbalanced Data

I have used sklearn to create a basic multiclass naive bayes text classifier. I have 3 classes and around 800 rows of data. Class A has 564 rows, Class B has 159, and Class C has 82. As you can see the data is unbalanced among the classes and I…
ssal
-1
votes
1 answer

Preparing data for Multinomial Naive Bayes and running algorithm - ValueError: Found input variables with inconsistent numbers of samples

I want to run the Multinomial NB algorithm to predict how many thumbs up a comment will get on Google Play. The data is scraped from reviews of offline navigation applications. I tried to run the algorithm using lists but it doesn't help, input must be…
-1
votes
1 answer

sklearn: Found arrays with inconsistent numbers of samples when calling naive_bayes.MultinomialNB(

I have looked at similar questions such as this one, but none of the mentioned solutions worked in my case. I am trying to build a text classification prediction model. def train_model(classifier, feature_vector_train, label,…
leena
-1
votes
1 answer

Avoid getting nonsensical values (e.g. -0.0) when classifying very long texts with Naive Bayes

I've written a fairly efficient implementation of a multinomial Naive Bayes classifier and it works like a charm. That is, until the classifier encounters very long messages (on the order of 10k words), where the prediction results are…
Nocturn9X
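The excerpt above is truncated, so the actual cause is not known. One common source of degenerate scores on very long inputs is floating-point underflow when many small probabilities are multiplied together. A minimal sketch of the usual remedy, summing log-probabilities and normalizing with log-sum-exp, is shown here. The function name `log_posteriors`, the toy vocabulary, and the `unseen_log_prob` fallback are all hypothetical and not taken from the question.

```python
import math

def log_posteriors(tokens, log_priors, log_likelihoods, unseen_log_prob):
    """Return normalized log-posteriors per class, staying in log space."""
    scores = {}
    for label, log_prior in log_priors.items():
        word_logs = log_likelihoods[label]
        # Sum log-probabilities instead of multiplying raw probabilities.
        scores[label] = log_prior + sum(
            word_logs.get(token, unseen_log_prob) for token in tokens
        )
    # Log-sum-exp: subtract the largest score before exponentiating.
    top = max(scores.values())
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {label: s - log_norm for label, s in scores.items()}

# Toy model with made-up probabilities (assumption, for illustration only).
log_priors = {"spam": math.log(0.5), "ham": math.log(0.5)}
log_likelihoods = {
    "spam": {"offer": math.log(0.6), "meeting": math.log(0.4)},
    "ham": {"offer": math.log(0.3), "meeting": math.log(0.7)},
}

long_text = ["offer"] * 10_000
naive_product = 0.6 ** 10_000  # underflows to 0.0 in double precision
result = log_posteriors(long_text, log_priors, log_likelihoods, math.log(1e-6))
```

Comparing the entries of `result` still ranks the classes correctly, whereas the raw product has already collapsed to zero for every class.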
-1
votes
2 answers

Get close words from a dictionary (database) when a word is misspelled, using the Naive Bayes algorithm

I would like to use Naive Bayes for text classification to get close words from a dictionary (database) when the user misspells a word. For example: the user enters "sheese" and the output would be "cheese". How can I use it? knowing that my…
-1
votes
1 answer

how to use MultinomialNB model with live data?

I am new to machine learning. I was trying to develop a sentiment analysis application for a CRM project. My model predicts the follow-up status when a user enters a follow-up comment. To create the model, I made the program read old comments and…
RatheeshTS
-1
votes
1 answer

Naive Bayes classifier for movie reviews has very low accuracy, despite several attempts of feature selection

I am fairly new to machine learning and have been tasked with building a machine learning model to predict whether a review is good (1) or bad (0). I have already tried to use a RandomForestClassifier that output an accuracy of 50%. I switched to…
geds133
-1
votes
1 answer

What is a classifier in Python (Gaussian Naive Bayes)?

When I use the following code, what exactly does that "clf" part mean? Is that a variable? I know that's a classifier, but is a classifier a function in Python, or is it just a variable named that way? I am new to Python and…
-1
votes
1 answer

How to test a new word set against my NLP Naive Bayes classifier

I built an NLP classifier based on Naive Bayes using Python scikit-learn. I want my classifier to classify a new text that does not belong to any of my training or testing data sets. In another model, like regression, I can…
-1
votes
1 answer

Tuning the threshold in Binarizer for the BernoulliNB() classifier

I want to use the BernoulliNB() classifier, and my data is not binarized. So I want to choose the best binarization threshold with GridSearchCV(). My code looks like: from sklearn.pipeline import Pipeline from sklearn.naive_bayes import…
KejBi
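The question's own code is cut off, so the following is only a sketch of one way the described setup could look, not the asker's pipeline. It places scikit-learn's `Binarizer` ahead of `BernoulliNB` in a `Pipeline` and lets `GridSearchCV` search the `threshold` parameter. The synthetic data, the step names `binarizer` and `clf`, and the candidate thresholds are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import Binarizer

# Synthetic continuous features (assumption; stands in for the real data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 5))
y = (X[:, 0] > 0.6).astype(int)

pipeline = Pipeline([
    ("binarizer", Binarizer()),
    # binarize=None tells BernoulliNB the input is already binary,
    # so only the Binarizer step controls the threshold.
    ("clf", BernoulliNB(binarize=None)),
])

# Step name + double underscore + parameter name addresses a pipeline step.
param_grid = {"binarizer__threshold": [0.2, 0.4, 0.6, 0.8]}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
best_threshold = search.best_params_["binarizer__threshold"]
```

`BernoulliNB` also accepts a `binarize` parameter directly, so searching over `clf__binarize` without a separate `Binarizer` step would be an alternative design.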
-1
votes
2 answers

Naive Bayes accuracy increases as the alpha value increases

I'm using Naive Bayes for text classification and I have 100k records, of which 88k are positive class records and 12k are negative class records. I converted sentences to unigrams and bigrams using CountVectorizer and I took alpha range from…
-1
votes
3 answers

Text Preprocessing for classification - Machine Learning

What are the important steps for preprocessing Twitter texts to classify them into two classes? What I did is remove the hashtag symbol and keep the word without it. I also used some regular expressions to remove special characters. These are the two functions I…
user9913522
-1
votes
2 answers

python string: remove all zeros and one value before each zero in a string

I have a naive Python question; I need to remove "0" and the value before it (any value) from a string "s": s = "8 9243 8607 56 586262 4697424 00000 ik 0 C S<->L G 0 G 0 G 0 P …
user3698773
-1
votes
1 answer

Why is my classifier not working well for new data (data that was not part of the dataset)?

My dataset has COPD documents as positive data (86) and malaria (20) + diarrhea (20) + elephantiasis (20) as negative data. So the total number of documents in my dataset is 146, with 86 positive and 60 negative. I have taken ratio of training: testing is…
-1
votes
1 answer

Why does Naive Bayes require balanced training data?

I created a word sentiment app using the Naive Bayes algorithm. There are two types of criteria in this classification training data, that is positive training data and negative training data. I take a unique word on every training data that has…