Questions tagged [text-classification]

Simply put, text classification means assigning a piece of text to one or more of a set of (mostly predefined) categories. It is an important problem that occurs in many real-world applications. For example, an automated call centre might want to sort incoming complaints automatically into the most appropriate bucket of problems.

Text classification is a sub-problem of the more general problem of classification. Here the input is a piece of text (rather than images, sounds, videos, etc.). The output could be:

  • binary (binary classification)
  • one category out of k possible categories (multi-class)
  • a set of categories out of k possible categories (multi-label).

In text classification, the features extracted from the text are usually sparse (instead of dense, as in image classification).
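
As a rough illustration of the points above, the following minimal sketch builds bag-of-words count vectors in plain Python and shows that most entries are zero, then writes out what the three output types could look like. The example texts, the category names, and the helper functions (`tokenize`, `build_vocabulary`, `to_sparse_counts`) are all hypothetical and invented for this sketch; real projects would normally use a library vectorizer and a proper tokenizer.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def to_sparse_counts(text, vocabulary):
    """Return {column index: count} for the tokens that occur in `text`.

    Only non-zero entries are stored; tokens outside the vocabulary are ignored.
    """
    counts = Counter(tok for tok in tokenize(text) if tok in vocabulary)
    return {vocabulary[tok]: n for tok, n in counts.items()}


# Hypothetical call-centre complaints, invented for illustration.
texts = [
    "my bill is wrong",
    "the router keeps dropping the connection",
    "wrong bill and slow connection",
]

vocabulary = build_vocabulary(texts)
vectors = [to_sparse_counts(t, vocabulary) for t in texts]

# Fraction of non-zero entries in the full document-term matrix.
density = sum(len(v) for v in vectors) / (len(texts) * len(vocabulary))

# The three output types, using hypothetical category names.
binary_labels = [1, 0, 1]                                # billing-related or not
multiclass_labels = ["billing", "network", "billing"]    # exactly one of k
multilabel_labels = [{"billing"}, {"network"}, {"billing", "network"}]  # a subset of k

print(f"vocabulary size: {len(vocabulary)}, density: {density:.2f}")
```

Even with three short texts, each document uses only a fraction of the vocabulary; with realistic vocabularies of tens of thousands of tokens the fraction of non-zero entries becomes far smaller, which is why sparse storage is the usual choice.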

1694 questions
0
votes
1 answer

Can we combine baseline Naive Bayes, Multinomial Naive Bayes and Semi-supervised NB?

I am working on sentiment analysis of Twitter data. I have tried a couple of Naive Bayes models, such as Baseline Naive Bayes, Multinomial NB, Bernoulli NB and Semi-supervised NB. My question is whether there is a way we can combine the…
0
votes
2 answers

Op type not registered HashTableV2 in Tensorflow 1.4.1 while deploying in Cloud ML

When deploying the model to Cloud ML, we get the error "Bad model Op type not registered HashTableV2". Code: def model_fn(features, labels, mode): if mode == tf.estimator.ModeKeys.TRAIN: tf.keras.backend.set_learning_phase(True) else: …
0
votes
1 answer

text classification using logistic regression

I am planning to classify emails. I am using a TF-IDF vectorizer and logistic regression to do this. I took very small training and testing sets. My training set consists of 150 emails (3 classes, 50 emails per class) and my testing set consists of…
0
votes
0 answers

R Match predefined topics to texts

I have some topics and a lot of texts. I need to match the topics to the texts, so I am asking whether there is an R package that can do this. Here's some sample data: texts = c("I like my pet. It is a dog which has brown hair. He likes to play with…
0
votes
1 answer

Tensorboard Embedding Visualization

I am working on a text classification problem. I have 3M+ rows which need to be classified into 20 categories. Following are two code snippets from my entire code. This is the code where my tf variables are defined: class TextCNNRNN(object): …
0
votes
1 answer

Can I predict a continuous target value in Keras?

This is an example of the data that I have. The length of df is 1778360. The search term is the query that people type into a search engine. CR (Conversion Rate) is a continuous number. It starts at 0 and has no upper limit. Search term …
0
votes
2 answers

For text classification with scikit-learn, do I have to use both CountVectorizer and TF-IDF?

Looking through the scikit-learn documentation code, it suggests applying CountVectorizer first and then TF-IDF on top. Can I use only TF-IDF? http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html …
0
votes
2 answers

Multiclass text classification with python and nltk

I have been given the task of classifying news text data into one of the following 5 categories: Business, Sports, Entertainment, Tech and Politics. About the data I am using: it consists of text data labeled as one of the 5 types of news statement…
0
votes
1 answer

add custom dataset into fasttext classification deep learning

Based on this GitHub link https://github.com/brightmart/text_classification, I want to run the "fasttext" classification, but there are some files that I couldn't find, so I want to add my custom dataset to it as input and after that run…
brelian • 403 • 2 • 15 • 31
0
votes
1 answer

How to interpret scored probabilities in machine learning classification algorithm?

I am using two neural networks for two-class text classification. I'm getting 90% accuracy on test data. I am also using different performance metrics like precision, recall, F-score and the confusion matrix to make sure that the model is performing as…
0
votes
1 answer

Convert textual document to tf.data in tensorflow for reading sequentially

My text corpus contains 50 documents, each about 80 lines long. I want to feed the corpus to TensorFlow as input, but I want to batch each document as the system reads it. Actually, the same as TFRecord…
0
votes
0 answers

R stops working/terminates when executing a maximum entropy program

I need your help because I don't understand why my RStudio terminates/stops working when I execute maximum entropy for text classification. I am using tweet data with 7877 rows. Here is the code…
0
votes
1 answer

TensorFlow - Understanding filter and stride shapes for CNN text classification

I am reviewing Denny Britz's tutorial on text classification using CNNs in TensorFlow. Filter and stride shapes make perfect sense in the image domain. However, when it comes to text, I am confused about how to correctly define the stride and filter…
Brian • 7,098 • 15 • 56 • 73
0
votes
2 answers

UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 11597: ordinal not in range(128)

I am using the well-known 20 Newsgroups data set for text categorisation in Jupyter. When I try to open and read a file on my Mac, it fails at the decoding step. I tried to read the file in byte format, which works, but I further need to work with it as…
0
votes
1 answer

Sentiment analysis and classification of users based on their tweets: best approach for classifying users (positive or negative) based on tweets?

I am working on classifying the followers of a Twitter account as positive or negative based on their tweets: collecting data (all the followers and their tweets from the respective account), sentiment analysis of each tweet and…