Questions tagged [text-classification]

Simply put, text classification is the task of assigning a piece of text to one or more of a set of (mostly predefined) categories. It is an important problem that arises in many real-world applications. For example, an automated call centre might want to sort incoming complaints automatically into the most appropriate problem category.

Text classification is a special case of the more general classification problem. Here the input is a piece of text (rather than images, sounds, videos, etc.). The output could be:

binary (binary classification)
one category out of k possible categories (multi-class)
a set of categories out of k possible categories (multi-label).
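
As a rough illustration of how the three output types differ in practice, here is a minimal sketch of how the targets might be encoded. The category names and helper functions are hypothetical, chosen only to echo the call-centre example; they do not come from any particular library.

```python
# Hypothetical category list, for illustration only.
CATEGORIES = ["billing", "network", "hardware"]


def encode_multiclass(label):
    """Multi-class: exactly one category out of k, stored as its index."""
    return CATEGORIES.index(label)


def encode_multilabel(labels):
    """Multi-label: any subset of the k categories, stored as a 0/1 vector."""
    chosen = set(labels)
    return [1 if category in chosen else 0 for category in CATEGORIES]


# Binary: a single yes/no decision (e.g. "is this text a complaint?").
binary_target = 1

# Multi-class: the complaint belongs to exactly one bucket.
multiclass_target = encode_multiclass("network")

# Multi-label: the complaint may belong to several buckets at once.
multilabel_target = encode_multilabel(["billing", "hardware"])
```

The only structural difference is the shape of the target: a single bit, a single index, or a vector of k bits.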

In text classification, the features extracted from the text are usually sparse (rather than dense, as in image classification).
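
To make the sparsity point concrete, here is a small sketch of a bag-of-words representation using only the Python standard library. The whitespace tokenisation, the sample documents, and the function names are all assumptions made for illustration; real projects typically use a library vectoriser instead.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def to_sparse_counts(document, vocab):
    """Return {column index: count}, storing only the non-zero entries."""
    counts = Counter(
        token for token in document.lower().split() if token in vocab
    )
    return {vocab[token]: n for token, n in counts.items()}


# Hypothetical training documents.
docs = [
    "my internet connection is down",
    "wrong amount on my bill",
    "the router is broken",
]
vocab = build_vocabulary(docs)
vector = to_sparse_counts("my bill is wrong", vocab)

# Fraction of vocabulary columns that are non-zero for this document.
density = len(vector) / len(vocab)
```

Each document touches only a handful of the vocabulary's columns, and that fraction shrinks as the vocabulary grows, which is why text features are usually stored in sparse form. Tokens not seen when the vocabulary was built are simply dropped in this sketch.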

1694 questions

1 answer

Classifying words inside a document

The problem that I'm facing is: I want to read a document, get the raw string of this document, and classify the information. For example, I want to identify when the string is a "Name", a "date", or some other useful information. Is it possible…

machine-learning text-classification

asked Jun 02 '16 at 13:23

Eduardo Briguenti Vieira

0 answers

Naive Bayes Classification: Understanding example correctly?

I am currently looking into the multinomial model for Naive Bayes classification, and have come across the following example: I think I understand everything, but I have developed the following reasoning I would like confirmed: For a given class…

algorithm information-retrieval text-classification naivebayes multinomial

asked May 29 '16 at 18:11

yulai

1 answer

Classification of sparse data

I am struggling with the best choice for a classification/prediction problem. Let me explain the task: I have a database of keywords from abstracts for different research papers, and I also have a list of journals with specified impact factors. I want…

python r classification data-mining text-classification

asked May 28 '16 at 10:19

tretyacv

1 answer

Should I use word2vec to do word embedding including testing data?

I am new to NLP and I am trying to do a text classification job. Before doing the job, I know that we should do word embedding. My question is: should I do the word embedding only on the training data (so that the testing data get vectors just from…

machine-learning nlp text-classification word2vec word-embedding

asked May 22 '16 at 03:43

Nils Cao

2 answers

Should I remove stopwords when feeding sentences to an RNN?

In a bag-of-words model, I know we should remove stopwords and punctuation before training. But in an RNN model, if I want to do text classification, should I remove stopwords too?

machine-learning nlp deep-learning text-classification recurrent-neural-network

asked May 19 '16 at 14:13

Nils Cao

1 answer

Weka Classification Project Using StringToWordVector and SMO

I am working on a project in which I have about 18 classes with about 4,000 total instances. I have 7 attributes, 1 being string data, the rest nominal. I am currently using StringToWordVector on the string attribute with Platt's SMO classifier,…

classification weka text-classification document-classification multilabel-classification

asked May 18 '16 at 21:17

Will Ebert

2 answers

Scikit-learn: How to extract features from the text?

Assume I have an array of Strings: ['Laptop Apple Macbook Air A1465, Core i7, 8Gb, 256Gb SSD, 15"Retina, MacOS' ... 'another device description'] I'd like to extract from this description features like: item=Laptop brand=Apple model=Macbook Air…

machine-learning scikit-learn text-classification

asked May 15 '16 at 10:14

Novitoll

1 answer

Naive Bayes unseen features handling scikit learn

I am classifying small texts (tweets) using Naive Bayes (MultinomialNB) in scikit-learn. My training data has 1000 features, and my test data has 1200 features. Let's say 500 features are common to both the training and test data. I wonder why…

machine-learning scikit-learn text-classification naivebayes

asked May 08 '16 at 21:41

Uylenburgh

1 answer

Text-Classification: Bag of words with MinMax-Scaler

I am trying to classify documents based on their bag-of-words representation (features: 1000). For the classification I am using an SVM, and it seems that sometimes the SVM doesn't terminate and runs endlessly. (Running scikit-learn: SVC(C=1.0,kernel='linear',…

machine-learning scikit-learn text-classification

asked May 03 '16 at 12:58

jobooo

1 answer

Problems with Naive Bayes

I'm trying to run Naive Bayes in R for making predictions from textual data (by building a Document Term Matrix). I read several posts warning about terms that could be missing in both the training and the testing set, so I decided to work with only…

r classification text-classification naivebayes

asked Apr 30 '16 at 19:39

JorgeF

1 answer

GATE machine learning doesn't work

I want to use the batch learning PR to perform text classification in GATE. I first wrote this configuration XML, and it works.

java machine-learning text-classification gate

asked Apr 26 '16 at 10:45

Fan Yang

0 answers

Naïve Bayes Algorithm Always comes out as 0

I have a question regarding the Naïve Bayes classification method. I ran through what I thought was an easy example but ran into a snag. Basically here is the classification I would like to do: I want to be able to take some training data: input1 |…

artificial-intelligence classification probability text-classification naivebayes

asked Apr 17 '16 at 20:58

9er

2 answers

I need to perform naive bayes text classification. Getting error while running the naiveBayes() method

I am getting an error while using the naiveBayes() method in R. I am passing as.matrix(train_matrix) as the first parameter and as.factor(train_data$subcategory) to the naiveBayes function. I am getting the error below: model <-…

r text-classification naivebayes

asked Apr 14 '16 at 07:11

Madhav pandey

3 answers

Machine learning text classification where a text belongs to 1 to N classes

So I am trying to (just for fun) classify movies based on their description; the idea is to "tag" movies, so a given movie might be "action" and "humor" at the same time, for example. Normally when using a text classifier, what you get is the class…

machine-learning statistics text-classification naivebayes

asked Apr 13 '16 at 22:15

Juan Antonio Gomez Moriano

1 answer

How does GATE process machine learning (text classification)?

Taking the following sentence as an example (taken from the official GATE tutorial slides, module 11: https://gate.ac.uk/sale/talks/gate-course-may10/track-3/module-11-ml-adv/): I was told the item was in stock and next day delivery. After a couple of…

nlp text-mining text-classification gate

asked Mar 23 '16 at 09:48

Fan Yang
