Questions tagged [naivebayes]

Naive Bayes is a popular baseline method for text classification.
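
As a rough illustration of the method the tag describes, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It assumes whitespace tokenization and add-one (Laplace) smoothing; the function names `train` and `predict` and the toy sentences in `TRAIN` are hypothetical and are not taken from any question below.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Fit on a list of (text, label) pairs.

    Returns (log_prior, log_likelihood, vocab).
    """
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)

    total_docs = sum(class_docs.values())
    log_prior = {c: math.log(n / total_docs) for c, n in class_docs.items()}

    log_likelihood = {}
    for c in class_docs:
        # Add-one smoothing: every vocabulary word gets one pseudo-count.
        denom = sum(word_counts[c].values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(text, model):
    """Return the class with the highest log-posterior score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += log_likelihood[c][word]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data.
TRAIN = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]
MODEL = train(TRAIN)
```

Calling `predict("loved great", MODEL)` scores each class by adding the log-prior to the summed per-word log-likelihoods. Working in log space avoids underflow when many small probabilities are multiplied.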

1035 questions
3
votes
1 answer

label_binarize Does not fit for sklearn Naive Bayes classifier showing bad input shape

I was trying to create a ROC curve for a multiclass problem using Naive Bayes, but it ends with ValueError: bad input shape. import numpy as np import matplotlib.pyplot as plt from itertools import cycle from sklearn import svm, datasets from…
3
votes
1 answer

Difference between coef_ and feature_log_prob_ in multinomial naive bayes?

The code below represents sklearn's multinomial Naive Bayes. import numpy as np from sklearn.naive_bayes import MultinomialNB X = np.random.randint(5, size=(10, 100)) y = np.random.randint(2, size=(10,)) clf = MultinomialNB() clf.fit(X, y) Then I want…
user10360186
3
votes
0 answers

Using NLTK/NaiveBayesClassifier, always getting 'Positive' output on test data

I have recently begun trying to get into Machine Learning and was following a tutorial that created a model that could determine if an inputted tweet was positive or negative. The program worked decently well, but it wasn't very accurate and I…
Tom D
3
votes
1 answer

How to solve ambiguity in sentiment analysis?

I'm quite new to text mining and I'm challenging myself to do sentiment analysis today, but I have run into some problems. In my language, a word can have several different meanings. For example, "setan" means: 1) devils 2)…
3
votes
1 answer

Multiclass Classification and probability prediction

import pandas as pd import numpy from sklearn import cross_validation from sklearn.naive_bayes import GaussianNB fi = "df.csv" # Open the file for reading and read in data file_handler = open(fi, "r") data = pd.read_csv(file_handler,…
3
votes
0 answers

Issues with Naive Bayes Text Classification with two Categories in R

I'm trying to implement a Naive Bayes classifier on a data set which contains text data in the form of complaints from customers (Complaint) and Reddit comments (General_Text). The whole set has 250'000 Texts for each category. However, I use only…
3
votes
2 answers

How can I use word2vec to train a classifier?

The code is used to generate word2vec vectors and use them to train a naive Bayes classifier. I am able to generate word2vec and use the similarity functions successfully. As a next step I want to use the word2vec vectors to train the naive Bayes classifier.…
Raj
3
votes
1 answer

How to create Naive Bayes in R for numerical and categorical variables

I am trying to implement a Naive Bayes model in R based on known information: Age group, e.g. "18-24" and "25-34", etc. Gender, "male" and "female" Region, "London" and "Wales", etc. Income, "£10,000 - £15,000", etc. Job, "Full Time" and "Part…
3
votes
2 answers

How to predict after training data using naive bayes with python?

I have a dataset that contains just two columns useful for training my model: the first is the news heading and the second is the news category. I got the following training command running successfully using Python: import re import numpy as…
3
votes
1 answer

ValueError encountered in SMOTE imblearn.over_sampling

I have been trying to oversample my dataset since it is not balanced. I am doing a binary text classification and would like to keep a ratio of 1 between both my classes. I am trying the SMOTE mechanism to solve the problem. I followed this…
Ankur Sinha
3
votes
0 answers

How to plot the decision boundary for a Gaussian Naive Bayes classifier?

I use the toy dataset (class membership variable & 2 features) below to apply a Gaussian Naive Bayes model and plot the contours of the class-specific bivariate normal distributions. How to add a line for the decision boundary to the plot…
majom
3
votes
1 answer

I need to improve Naive Bayes text classification accuracy

I'm using Ruby to implement Naive Bayes. I need to classify a text into one of 4 categories. I tried to optimize it in several ways, but none seems to work. I removed the stopwords, stemmed the words, parameterized,…
3
votes
2 answers

Quanteda package, Naive Bayes: How can I predict on different-featured test data?

I used quanteda::textmodel_NB to create a model that categorizes text into one of two categories. I fit the model on a training data set of data from last summer. Now, I am trying to use it this summer to categorize new text we get here at work. I…
Mark White
3
votes
1 answer

How can we use TFIDF vectors with multinomial naive bayes?

Say we have used the TFIDF transform to encode documents into continuous-valued features. How would we now use this as input to a Naive Bayes classifier? Bernoulli naive-bayes is out, because our features aren't binary anymore. Seems like we can't…
dhrumeel
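
One way to look at the question above: the multinomial Naive Bayes score only uses each feature value as a weight on a log-probability, so fractional values such as TF-IDF weights fit the arithmetic, even though they are no longer literal counts. The sketch below is a plain-Python illustration under that assumption. The helper names (`tfidf`, `fit`, `score`), the smoothed IDF variant, the toy documents, and the query weights are all hypothetical and do not come from the question.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses one common smoothed IDF variant: ln((1 + n) / (1 + df)) + 1.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def fit(weighted_docs, labels, alpha=1.0):
    """Multinomial NB where fractional weights stand in for word counts."""
    vocab = sorted({t for doc in weighted_docs for t in doc})
    log_prior, log_prob = {}, {}
    for c in sorted(set(labels)):
        rows = [d for d, y in zip(weighted_docs, labels) if y == c]
        log_prior[c] = math.log(len(rows) / len(labels))
        # Total (fractional) weight of each term within class c.
        mass = {t: sum(d.get(t, 0.0) for d in rows) for t in vocab}
        denom = sum(mass.values()) + alpha * len(vocab)
        log_prob[c] = {t: math.log((mass[t] + alpha) / denom) for t in vocab}
    return log_prior, log_prob


def score(doc, model):
    """Log-posterior (up to a constant) per class for a {term: weight} doc."""
    log_prior, log_prob = model
    return {
        c: log_prior[c]
        + sum(w * log_prob[c][t] for t, w in doc.items() if t in log_prob[c])
        for c in log_prior
    }


# Hypothetical toy corpus, already tokenized.
DOCS = [
    ["cheap", "pills", "buy", "now"],
    ["buy", "cheap", "watches"],
    ["meeting", "agenda", "attached"],
    ["lunch", "meeting", "tomorrow"],
]
LABELS = ["spam", "spam", "ham", "ham"]

WEIGHTED = tfidf(DOCS)
MODEL = fit(WEIGHTED, LABELS)

# Hypothetical fractional weights for two unseen documents.
SPAM_QUERY = {"cheap": 1.5, "buy": 0.7}
HAM_QUERY = {"meeting": 1.2, "lunch": 0.4}
```

Taking `max(scores, key=scores.get)` over the result of `score(SPAM_QUERY, MODEL)` gives the predicted class. Whether fractional weights are statistically justified is a separate matter; this only shows that the computation goes through.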
3
votes
1 answer

R naiveBayes classifier predict "subscript out of bounds"

I am new to R and am trying to work out why my predict call is out of bounds. This should be an easy fix, as it is more of an introductory exercise. I set up my classifier with the training data: sms_classifier <- naiveBayes(sms_train, sms_train_labels) but error…