Questions tagged [text-classification]

Simply put, text classification is the task of assigning a piece of text to one or more categories from a (mostly predefined) set. It is an important problem that occurs in many real-world applications. For example, an automated call centre might want to sort incoming complaints automatically into the most appropriate bucket of problems.

Text classification is a sub-problem of the more general problem of classification. Here the input is a piece of text (rather than images, sounds, videos, etc.). The output could be:

one of two categories (binary)
one category out of k possible categories (multi-class)
a set of categories out of k possible categories (multi-label).
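
The three output types differ mainly in how the target is encoded. A minimal sketch in plain Python, assuming a hypothetical category list for the call-centre example; the names `CATEGORIES`, `encode_multiclass` and `encode_multilabel` are illustrative and not taken from any library:

```python
# Hypothetical categories for a call-centre complaint classifier (k = 4).
CATEGORIES = ["billing", "network", "hardware", "other"]


def encode_multiclass(label):
    """Multi-class: exactly one category, stored as its index in CATEGORIES."""
    return CATEGORIES.index(label)


def encode_multilabel(labels):
    """Multi-label: any subset of categories, stored as a 0/1 indicator vector."""
    chosen = set(labels)
    return [1 if category in chosen else 0 for category in CATEGORIES]


# Binary: a single yes/no decision, e.g. "is this text a complaint?"
binary_target = 1

# Multi-class: one category out of k.
multiclass_target = encode_multiclass("network")

# Multi-label: a set of categories out of k.
multilabel_target = encode_multilabel(["billing", "hardware"])
```

Libraries usually ship their own encoders for this, so the helpers above are only meant to show the shape of each target.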

In text classification, the features extracted from the text are usually sparse (rather than dense, as in image classification).
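
To see where the sparsity comes from, here is a rough bag-of-words sketch using only the standard library. The toy corpus and the helper `bag_of_words` are assumptions made for illustration; a real pipeline would normally use a library vectorizer and a sparse matrix format instead of Python lists:

```python
from collections import Counter

# Toy corpus; real corpora have vocabularies of tens of thousands of words.
documents = [
    "my internet connection keeps dropping",
    "i was billed twice this month",
    "the router will not turn on",
]

tokenized = [doc.split() for doc in documents]
vocabulary = sorted({token for tokens in tokenized for token in tokens})


def bag_of_words(tokens):
    """Return one count per vocabulary word (a dense view of a sparse vector)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


vectors = [bag_of_words(tokens) for tokens in tokenized]

# Fraction of entries that are zero across the whole document-term matrix.
total_entries = len(vectors) * len(vocabulary)
zero_entries = sum(value == 0 for vector in vectors for value in vector)
sparsity = zero_entries / total_entries
```

Each document uses only a handful of the vocabulary words, so most entries in its vector are zero, and the fraction of zeros typically grows as the vocabulary gets larger.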

1694 questions

2 answers

TensorFlow - Text Classification using Neural Networks

Is there an example of how TensorFlow can be used for text classification using neural networks?

text-classification tensorflow

asked Nov 14 '15 at 05:00

Sumit Chawla

2 answers

Testing the NLTK classifier on specific file

The following code runs a Naive Bayes movie review classifier. The code generates a list of the most informative features. Note: the **movie review** folder is in the nltk. from itertools import chain from nltk.corpus import stopwords from…

python-2.7 nlp classification nltk text-classification

asked Mar 27 '15 at 13:34

ZaM

3 answers

Dealing with class imbalance in multi-label classification

I've seen a few questions on class imbalance in a multiclass setting. However, I have a multi-label problem, so how would you deal with it in this case? I have a set of around 300k text examples. As mentioned in the title, each example has at least…

machine-learning classification text-classification vowpalwabbit

asked Dec 09 '13 at 00:55

richizy

1 answer

Naive Bayes probability always 1

I started using sklearn.naive_bayes.GaussianNB for text classification, and have been getting fine initial results. I want to use the probability returned by the classifier as a measure of confidence, but the predict_proba() method always returns…

python scikit-learn text-classification

asked Aug 05 '13 at 14:05

AviM

1 answer

Unable to train my keras model : (Data cardinality is ambiguous:)

I am using the bert-for-tf2 library to do a Multi-Class Classification problem. I created the model but training throws the following error: --------------------------------------------------------------------------- ValueError …

machine-learning nlp text-classification tensorflow2.0 tf.keras

asked Dec 03 '19 at 11:04

Amal Vijayan

3 answers

Improving on the basic, existing GloVe model

I am using GloVe as part of my research. I've downloaded the models from here. I've been using GloVe for sentence classification. The sentences I'm classifying are specific to a particular domain, say some STEM subject. However, since the existing…

nlp text-classification glove

asked Apr 25 '17 at 18:15

cs95

2 answers

GridSearchCV: How to specify test set?

I have a question regarding GridSearchCV: by using this: gs_clf = GridSearchCV(pipeline, parameters, n_jobs=-1, cv=6, scoring="f1") I specify that k-fold cross-validation should be used with 6 folds right? So that means that my corpus is split into…

python scikit-learn cross-validation text-classification

asked Nov 11 '16 at 10:37

user3629892

1 answer

How to classify URLs? what are URLs features? How to select and Extract features from URL

I have just started to work on a classification problem. It's a two-class problem: my trained machine learning model will have to decide/predict whether to allow a URL or block it. My question is very specific. How to classify URLs? Should I use…

url machine-learning classification feature-extraction text-classification

asked Oct 20 '14 at 00:22

Nasir

1 answer

Pre-Trained models for text Classification

So I have a few words without labels, but I need to classify them into 4-5 categories. I can visibly say that this test set can be classified. However, I do not have training data, so I need to use a pre-trained model to classify these words. Which…

python machine-learning keras text-classification pre-trained-model

asked Dec 12 '20 at 08:05

scifi_bot

1 answer

Passing multiple sentences to BERT?

I have a dataset with paragraphs that I need to classify into two classes. These paragraphs are usually 3-5 sentences long. The overwhelming majority of them are less than 500 words long. I would like to make use of BERT to tackle this problem. I am…

nlp text-classification bert-language-model huggingface-transformers

asked Nov 17 '20 at 18:50

jhfodr76

1 answer

How can I use GPT 3 for my text classification?

I am wondering whether I can use OpenAI GPT-3 for transfer learning in a text classification problem. If so, how can I get started on it using TensorFlow and Keras?

keras text-classification transfer-learning openai-api gpt-3

asked Aug 09 '20 at 02:17

anveshtummala

1 answer

Sliding window for long text in BERT for Question Answering

I've read a post that explains how the sliding window works, but I cannot find any information on how it is actually implemented. From what I understand, if the input is too long, a sliding window can be used to process the text. Please correct me if I…

nlp text-classification huggingface-transformers nlp-question-answering bert-language-model

asked Jul 19 '20 at 10:18

Benj

2 answers

No batch_size while making inference with BERT model

I am working on a binary classification problem with the TensorFlow BERT language model. Here is the link to Google Colab. After saving and loading the trained model, I get an error while doing the prediction. Saving the Model def…

python tensorflow machine-learning deep-learning text-classification

asked Jul 02 '19 at 06:06

joel

1 answer

Keras Embedding Layer: keep zero-padded values as zeros

I've been thinking about 0-padding of word sequence and how that 0-padding is then converted to the Embedding layer. At first glance, one would think that you want to keep the embeddings = 0.0 as well. However, Embedding layer in keras generates…

machine-learning keras text-classification word-embedding zero-padding

asked Jun 27 '19 at 20:51

David Makovoz

1 answer

How to recognize entities in text that is the output of optical character recognition (OCR)?

I am trying to do multi-class classification with textual data. The problem I am facing is that I have unstructured textual data. I'll explain the problem with an example. Consider this image, for example: I want to extract and classify text information…

nlp recurrent-neural-network text-classification named-entity-recognition named-entity-extraction

asked Mar 03 '19 at 10:52

Vivek Mehta
