Questions tagged [text-classification]

Simply stated, text classification means assigning a piece of text to one or more of a set of (mostly predefined) categories. It is an important problem that arises in many real-world applications. For example, an automated call centre may want to sort incoming complaints automatically into the most appropriate bucket of problems.

Text classification is a sub-problem of the more general problem of classification. Here the input is a piece of text (rather than images, sounds, videos, etc.). The output can be:

one of two categories (binary classification)
one category out of k possible categories (multi-class classification)
a set of categories out of k possible categories (multi-label classification).
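
As a rough illustration of the three output types listed above, the following sketch encodes targets for each case in plain Python. The category names and helper functions are hypothetical, chosen only to echo the call-centre example; real libraries have their own encoders.

```python
# Hypothetical category set for a call-centre complaint classifier (assumption).
CATEGORIES = ["billing", "network", "hardware"]


def one_hot(label):
    """Multi-class target: exactly one of the k categories is set."""
    return [1 if category == label else 0 for category in CATEGORIES]


def multi_hot(labels):
    """Multi-label target: any subset of the k categories may be set."""
    return [1 if category in labels else 0 for category in CATEGORIES]


# Binary: a single yes/no decision, e.g. "is this message a complaint?"
binary_target = 1

# Multi-class: the complaint belongs to exactly one bucket.
multiclass_target = one_hot("network")

# Multi-label: the complaint may belong to several buckets at once.
multilabel_target = multi_hot({"billing", "hardware"})

print(binary_target, multiclass_target, multilabel_target)
```

The only structural difference between the multi-class and multi-label targets is how many positions may be set at once.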

In text classification, the features extracted from the text are usually sparse (rather than dense, as in image classification).
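
To see why text features tend to be sparse, here is a minimal bag-of-words sketch in plain Python. The two sample sentences and all names (`docs`, `vocabulary`, `bag_of_words`) are illustrative assumptions, not part of any particular library.

```python
from collections import Counter

# Two toy documents (assumed sample data).
docs = [
    "my dog is a rice eater",
    "my cat prefers chocolate milk",
]

# The vocabulary is every distinct word seen in the corpus.
vocabulary = sorted({word for doc in docs for word in doc.split()})
word_index = {word: i for i, word in enumerate(vocabulary)}


def bag_of_words(text):
    """Return a sparse vector as {feature index: count}, storing only non-zero entries."""
    counts = Counter(word for word in text.split() if word in word_index)
    return {word_index[word]: n for word, n in counts.items()}


sparse_vector = bag_of_words(docs[0])

# Fraction of vocabulary entries that are non-zero for this document.
density = len(sparse_vector) / len(vocabulary)
print(len(vocabulary), sparse_vector, density)
```

With a toy vocabulary the vector is still fairly full, but a real corpus typically has many thousands of distinct words while a single document uses only a handful of them, so almost every entry is zero. That is why sparse representations are the usual choice for text.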

1694 questions

1 answer

Incremental/online learning using SGDClassifier partial_fit method

I have built an incremental learning model but am not sure whether it is right or wrong. I have 2 training datasets: the first consists of 20000 rows and the second of 10000 rows, both having two columns, description and id... In case of offline learning my…

asked Oct 10 '17 at 11:32

outlier

1 answer

architecture: building text suggestions based on existing text

Trying to build text suggestions similar to Stack Overflow question suggestions. I do not know where to start. Any suggestions on what tools/servers/algorithms I should be researching? Does this come under text classification? Any links towards this…

text-classification

asked Sep 22 '17 at 02:31

kumar

1 answer

CNN converges to same accuracy regardless of hyperparameters, what does this indicate?

I have written tensorflow code based on: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/ but using precomputed word embeddings from the GoogleNews word2vec 300 dimension model. I created my own data from the…

tensorflow conv-neural-network text-classification

asked Sep 19 '17 at 12:11

Kevinj22

0 answers

Classify text based on location/time from an establishing mention

Given a longer English text (> a few paragraphs), is there a rule-based NLP approach to classifying a set of text to be occurring at a place or time, from an establishing phrase? For example: Alice went to London. She met Bob at his hotel, and they…

nlp stanford-nlp text-mining text-classification information-extraction

asked Sep 07 '17 at 14:12

Joe Mitchell

1 answer

Why are Logistic Regression and SVM predictions multiplied by constants at the end?

I'm currently trying to understand certain high-level classification problems and have come across some code from a Kaggle competition that ran in 2012. The competition discussion board is (here) and the winning code is (here). At almost the end of…

python text-classification kaggle

asked Aug 25 '17 at 13:19

salvu

1 answer

how to classify input text under different categories

text= "my dog is a rice eater", "I want to buy an a new","my cat prefers chocolate milk" how could I extract keywords from these text (or text corpora) and classify them in different categories (i.e. dog, cat be categorized as Pet and rice,…

r text-classification

asked Aug 22 '17 at 11:27

mzhasan

1 answer

Extract topics from SMS messages

I have a dataset of SMS messages which is ill formatted and sparse. I tried to use topic modeling to get all the possible topics in each message with the probability of each associated topic. I need the probability to be able to arrange or rank each…

machine-learning text-classification topic-modeling multilabel-classification multiclass-classification

asked Aug 19 '17 at 08:10

user3379762

0 answers

Dealing With Variations In Sparse Matrix

I am working on text data structuring. I need to make predictions using the following format: xyz@gmail.com -> Email, India -> Country, etc. To achieve that, SVC along with OneVsRestClassifier is being used. The data extrapolation works just…

python python-2.7 machine-learning scikit-learn text-classification

asked Aug 17 '17 at 12:29

RAMASWAMY M

1 answer

Using feature selection with LinearSVC in python

I have a task to create a multi class classifier for product titles to classify them into 11 categories. I'm using scikit's LinearSVC for classification. I preprocessed the product titles first by removing stopwords, using POS tags for…

python scikit-learn pipeline text-classification feature-selection

asked Aug 14 '17 at 02:30

akrama81

0 answers

Multiclass text classification

I have a question regarding a project in which I have to classify text. In this project I have several thousand questions (strings), which should be put into the categories tech, sports, politics, history, science and geography. My training data…

python machine-learning svm text-classification

asked Aug 12 '17 at 13:01

James No

0 answers

Trying to get more performance on a text classification task

Right now I am working on a text classification task (trying to predict whether a Twitter response is human- or bot-generated). The task is actually a closed Kaggle competition, and more details as well as the datasets that were used can be found here:…

python machine-learning scikit-learn text-classification

asked Aug 09 '17 at 11:55

robertandrei

1 answer

Found array with dim 3. Estimator expected <= 2

I am using LDA over a simple collection of documents. My goal is to extract topics, then use the extracted topics as features to evaluate my model. I decided to use multinomial SVM as the evaluator. Not sure whether that is a good choice. import itertools from…

python machine-learning svm text-classification lda

asked Aug 08 '17 at 18:01

sariii

2 answers

Handling new features in classification models

I’m taking my first steps in ML, specifically with classifiers for text sentiment analysis. My approach is to make the usual 80% train dataset and 20% test. Having a trained model what is the best way to proceed in a production environment when new…

machine-learning sentiment-analysis text-classification

asked Aug 07 '17 at 17:07

João Cunha

1 answer

Performance: Improve Accuracy of a Naive Bayes Classifier

I am working on a simple Naive Bayes Text Classifier which uses the Brown Corpus for test and training data. So far, I have gotten an accuracy of 53% when using the simple approach without any preprocessing. In order to improve my classifier, I've…

python nltk text-classification naivebayes stemming

asked Aug 05 '17 at 15:00

LittleEntertainer

1 answer

Feedback in NaiveBayes Text Classification

I am a newbie in machine learning. I am building a complaint categorizer and I want to provide a feedback model so that it can improve over time. import numpy from sklearn.feature_extraction.text import CountVectorizer from sklearn.naive_bayes import…

machine-learning nlp classification text-classification

asked Aug 02 '17 at 12:28

Aditya Agarwal
