Questions tagged [classification]

In machine learning and statistics, classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of data containing observations whose category membership (label) is known.

In machine learning and statistics, classification refers to the problem of predicting category memberships based on a set of pre-labeled examples. It is thus a type of supervised learning.

Widely used classification algorithms include support vector machines, logistic regression, naive Bayes, random forests, and artificial neural networks.

When we wish to associate inputs with continuous values in a supervised framework, the problem is instead known as regression. The unsupervised counterpart to classification is known as clustering (or cluster analysis), and involves grouping data into categories based on some measure of inherent similarity.
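As a minimal illustration of predicting a category from pre-labeled examples, here is a Python sketch of a one-nearest-neighbour rule. It is not one of the algorithms listed above; it was picked only because it fits in a few lines. The function name `predict_1nn`, the data points, and the labels are all made up for illustration.

```python
import math


def predict_1nn(train, query):
    """Return the label of the training example closest to `query`.

    `train` is a list of (features, label) pairs; `features` and `query`
    are tuples of numbers with the same length.
    """
    _, nearest_label = min(train, key=lambda pair: math.dist(pair[0], query))
    return nearest_label


# Hypothetical training set: observations whose category is already known.
training_set = [
    ((1.0, 1.0), "small"),
    ((1.5, 0.5), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

# New observations are assigned the label of their nearest labeled neighbour.
print(predict_1nn(training_set, (1.2, 0.8)))
print(predict_1nn(training_set, (8.5, 9.2)))
```

If the labels were continuous numbers rather than categories, the same setup would be a regression problem; with no labels at all, grouping the points would be clustering.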

7859 questions
15
votes
2 answers

Can the Precision, Recall and F1 be the same value?

I am currently working on an ML classification problem and I'm computing the Precision, Recall and F1 using the following import from the sklearn library and the code shown below. from sklearn.metrics import…
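One situation in which the three metrics coincide is micro-averaging on a single-label multiclass problem: every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled counts match. The sketch below computes the pooled counts by hand in plain Python. It is an assumption that this applies to the truncated question above, and the helper `micro_scores` and the labels are hypothetical.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall and F1 for single-label data.

    Counts are pooled over all classes before the ratios are taken.
    Assumes at least one prediction is correct (otherwise this divides by zero).
    """
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: 4 of the 6 predictions are correct.
y_true = ["cat", "dog", "bird", "cat", "dog", "bird"]
y_pred = ["cat", "dog", "cat", "cat", "bird", "bird"]

precision, recall, f1 = micro_scores(y_true, y_pred)
print(precision, recall, f1)
```

With macro or per-class averaging the three values generally differ, so identical numbers are often a hint about which averaging mode was used.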
15
votes
2 answers

Object detection using Keras : simple way for faster R-CNN or YOLO

This question may already have been answered, but I didn't find a simple answer to it. I created a convnet using Keras to classify The Simpsons characters (dataset here). I have 20 classes and, given an image as input, I return the character name. It's…
A. Attia
  • 1,630
  • 3
  • 20
  • 29
15
votes
6 answers

Ordinal classification packages and algorithms

I'm attempting to make a classifier that chooses a rating (1-5) for an item i. For each item i, I have a vector x containing about 40 different quantities pertaining to i. I also have a gold standard rating for each item. Based on some function of…
tkerwin
  • 9,559
  • 1
  • 31
  • 47
15
votes
3 answers

How to use NLP to separate an unstructured text content into distinct paragraphs?

The following unstructured text has three distinct themes -- Stallone, Philadelphia and the American Revolution. But which algorithm or technique would you use to separate this content into distinct paragraphs? Classifiers won't work in this…
user193116
  • 3,498
  • 6
  • 39
  • 58
15
votes
2 answers

sklearn classifier get ValueError: bad input shape

I have a CSV whose structure is CAT1,CAT2,TITLE,URL,CONTENT; CAT1, CAT2, TITLE and CONTENT are in Chinese. I want to train LinearSVC or MultinomialNB with X(TITLE) and feature(CAT1,CAT2), and both give this error. Below is my code: PS: I write below code through…
Mithril
  • 12,947
  • 18
  • 102
  • 153
15
votes
3 answers

Why can vector normalization improve the accuracy of clustering and classification?

It is described in Mahout in Action that normalization can slightly improve the accuracy. Can anyone explain the reason, thanks!
Meng Zhang
  • 337
  • 1
  • 4
  • 13
15
votes
2 answers

Weak Classifier

I am trying to implement an application that uses the AdaBoost algorithm. I know that AdaBoost uses a set of weak classifiers, but I don't know what these weak classifiers are. Can you explain it to me with an example and tell me if I have to create my…
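A common choice of weak classifier for AdaBoost is a decision stump: a single threshold test on one feature that only needs to beat random guessing on the weighted data. The sketch below trains one stump on 1-D data in plain Python. It is an illustration under assumed names (`stump_predict`, `train_stump`) and made-up data, not the code of any particular library, and it shows only the weak learner, not the full boosting loop.

```python
def stump_predict(x, threshold, polarity):
    """Predict +1 or -1 by comparing a single feature with a threshold."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump.

    `ys` holds labels in {-1, +1}; `weights` are the per-example weights
    that a boosting loop would update between rounds.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (+1, -1):
            error = sum(
                w
                for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


# Hypothetical data that no single threshold separates perfectly.
xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, +1, -1, +1, +1]
weights = [1 / len(xs)] * len(xs)  # uniform weights, as in the first round

error, threshold, polarity = train_stump(xs, ys, weights)
print(error, threshold, polarity)
```

The stump misclassifies one point, so on its own it is a poor model, but its weighted error is below 0.5, which is all that boosting requires of each weak learner.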
14
votes
1 answer

Trove classifiers definition

I have bumped into this concept using Python distutils2/packaging. I did google it, but didn't fully grasp the idea, so would rather get a better explanation from someone more experienced to better assimilate the concept. "Trove classifiers are…
Joao Figueiredo
  • 3,120
  • 3
  • 31
  • 40
14
votes
5 answers

NLP and Machine learning for sentiment analysis

I'm trying to write a program that takes text (an article) as input and outputs the polarity of this text, whether it's a positive or a negative sentiment. I've read extensively about different approaches but I am still confused. I read about many…
14
votes
5 answers

Java machine learning library for commercial use?

Does anyone know a good Java machine learning library I can use for a commercial product? Weka and Rapidminer unfortunately do not allow this. I already found Apache Mahout and Java Data Mining Package. Has anyone experience with them and provide…
WorstCase
  • 325
  • 4
  • 13
14
votes
2 answers

Tensorflow Estimator: Cache bottlenecks

When following the tensorflow image classification tutorial, it first caches the bottleneck of each image (def: cache_bottlenecks()). I have rewritten the training using tensorflow's Estimator. This really simplified all the code. However I want…
Paul Woitaschek
  • 6,717
  • 5
  • 33
  • 52
14
votes
3 answers

How to give a constant input to keras

My network has two time-series inputs. One of the inputs has a fixed vector repeating for every time step. Is there an elegant way to load this fixed vector into the model just once and use it for computation?
Yakku
  • 453
  • 1
  • 7
  • 14
14
votes
1 answer

Class weights in binary classification model with Keras

We know that we can pass a class weights dictionary to the fit method for imbalanced data in a binary classification model. My question is: when using only 1 node in the output layer with sigmoid activation, can we still apply the class weights…
Jie HE
  • 173
  • 1
  • 8
14
votes
3 answers

Classification report with Nested Cross Validation in SKlearn (Average/Individual values)

Is it possible to get a classification report from cross_val_score through some workaround? I'm using nested cross-validation and I can get various scores here for a model; however, I would like to see the classification report of the outer loop. Any…
utengr
  • 3,225
  • 3
  • 29
  • 68
14
votes
2 answers

One-class classification with SVM in R

I'm using the package e1071 in R in order to build a one-class SVM model. I don't know how to do that, nor can I find any example on the Internet. Could someone give an example code to characterize, for example, the class "setosa" in the "iris"…
dreamscollector
  • 175
  • 1
  • 1
  • 8