Questions tagged [classification]

In machine learning and statistics, classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of data containing observations whose category membership (label) is known.

In machine learning and statistics, classification refers to the problem of predicting category memberships based on a set of pre-labeled examples. It is thus a type of supervised learning.

Some of the most important classification algorithms are support vector machines [svm], logistic regression, naive Bayes, random forests [random-forest], and artificial neural networks [neural-network].

When we wish to associate inputs with continuous values in a supervised framework, the problem is instead known as regression. The unsupervised counterpart to classification is known as clustering (or cluster analysis), and involves grouping data into categories based on some measure of inherent similarity.
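To make the definition above concrete, here is a minimal sketch of supervised classification using a nearest-centroid rule in plain Python. The rule, the function names (`fit_centroids`, `predict`), and the toy data are illustrative assumptions, not something the tag description prescribes; any of the algorithms listed above would fill the same role.

```python
from collections import defaultdict
from math import dist


def fit_centroids(points, labels):
    """Compute the mean point of each class from labeled training data."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        grouped[label].append(point)
    return {
        label: tuple(sum(coords) / len(members) for coords in zip(*members))
        for label, members in grouped.items()
    }


def predict(centroids, point):
    """Assign the label whose centroid is closest to the new observation."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical toy training set: two features per observation, known labels.
train_points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
train_labels = ["small", "small", "large", "large"]

centroids = fit_centroids(train_points, train_labels)
prediction = predict(centroids, (2.0, 1.5))  # label for an unseen observation
```

The training step uses only labeled examples, and prediction maps a new observation to one of the known categories, which is what separates classification from regression (continuous targets) and clustering (no labels).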

7859 questions

9 answers

How to get most informative features for scikit-learn classifiers?

The classifiers in machine learning packages like liblinear and nltk offer a method show_most_informative_features(), which is really helpful for debugging features: viagra = None ok : spam = 4.5 : 1.0 hello = True ok :…

python machine-learning classification scikit-learn

asked Jun 20 '12 at 09:36

tobigue

2 answers

What is out of bag error in Random Forests?

What is out of bag error in Random Forests? Is it the optimal parameter for finding the right number of trees in a Random Forest?
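As background for this question, the following is a minimal sketch of where out-of-bag samples come from, assuming the usual bootstrap setup in which each tree is trained on a sample drawn with replacement. The function names are hypothetical and this is not any library's implementation; it only shows which observations a single tree never sees.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one tree's bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag_indices(n_samples, in_bag):
    """Indices never drawn for this tree; they act as a held-out set for it."""
    drawn = set(in_bag)
    return [i for i in range(n_samples) if i not in drawn]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_samples = 1000
in_bag = bootstrap_indices(n_samples, rng)
oob = out_of_bag_indices(n_samples, in_bag)
oob_fraction = len(oob) / n_samples  # expected to be near 1/e, about 0.37
```

The out-of-bag error is then obtained by predicting each observation using only the trees for which it was out of bag and comparing those predictions with the known labels.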

language-agnostic machine-learning classification random-forest

asked Aug 30 '13 at 21:46

csalive

4 answers

How to interpret weka classification?

How can we interpret the classification results in Weka when using naive Bayes? How are the mean, standard deviation, weight sum, and precision calculated? How are the kappa statistic, mean absolute error, root mean squared error, etc. calculated? What is the…

classification weka

asked May 25 '10 at 10:55

user349821

7 answers

Loss function for class-imbalanced binary classifier in TensorFlow

I am trying to apply deep learning for a binary classification problem with high class imbalance between target classes (500k, 31K). I want to write a custom loss function which should be like:…

classification tensorflow

asked Feb 02 '16 at 14:07

Venkata Dikshit Pappu

2 answers

Why does prediction need a batch size in Keras?

In Keras, predict_classes() is used to predict the classes of a test dataset. For example: classes = model.predict_classes(X_test, batch_size=32) My question is: I know how batch_size is used in training, but why does it need a batch_size for…

neural-network classification keras

asked Jun 19 '16 at 20:04

malioboro

2 answers

What is the difference between a sigmoid followed by the cross entropy and sigmoid_cross_entropy_with_logits in TensorFlow?

When trying to get cross-entropy with sigmoid activation function, there is a difference between loss1 = -tf.reduce_sum(p*tf.log(q), 1) loss2 = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(labels=p, logits=logit_q),1) But they are the…
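The excerpt is truncated, so the following is only a hedged sketch of the two computations it appears to compare, written in plain Python instead of TensorFlow. It assumes a single label and logit; the function names are hypothetical. The numerically stable form `max(x, 0) - x * z + log(1 + exp(-abs(x)))` is the one given in the TensorFlow documentation for `sigmoid_cross_entropy_with_logits`.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def naive_sigmoid_cross_entropy(label, logit):
    """Full binary cross-entropy computed from the sigmoid output."""
    q = sigmoid(logit)
    return -(label * math.log(q) + (1.0 - label) * math.log(1.0 - q))


def stable_sigmoid_cross_entropy(label, logit):
    """Same quantity, rearranged to avoid overflow for large |logit|."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def positive_term_only(label, logit):
    """Only the -p * log(q) term, as in the excerpt's loss1."""
    return -label * math.log(sigmoid(logit))


label, logit = 0.0, 2.0
full = naive_sigmoid_cross_entropy(label, logit)
stable = stable_sigmoid_cross_entropy(label, logit)
partial = positive_term_only(label, logit)
```

Under these assumptions, `full` and `stable` agree, while `partial` drops the `(1 - p) * log(1 - q)` term and so differs whenever the label is not 1.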

machine-learning tensorflow classification cross-entropy sigmoid

asked Sep 19 '17 at 03:23

D.S.H.J

19 answers

scikit-learn: output metrics.classification_report into CSV/tab-delimited format

I'm doing a multiclass text classification in Scikit-Learn. The dataset is being trained using the Multinomial Naive Bayes classifier having hundreds of labels. Here's an extract from the Scikit Learn script for fitting the MNB model from __future__…

python csv text scikit-learn classification

asked Sep 23 '16 at 13:45

Seun AJAO

5 answers

XGBoost XGBClassifier Defaults in Python

I am attempting to use XGBoost's classifier to classify some binary data. When I do the simplest thing and just use the defaults (as follows) clf = xgb.XGBClassifier() metLearn = CalibratedClassifierCV(clf, method='isotonic', cv=2) metLearn.fit(train,…

python scikit-learn classification analytics xgboost

asked Jan 08 '16 at 10:30

Chris Arthur

3 answers

Save Naive Bayes Trained Classifier in NLTK

I'm slightly confused about how to save a trained classifier. Re-training a classifier each time I want to use it is obviously really bad and slow, so how do I save it and then load it again when I need it? Code is below, thanks in advance…

python machine-learning classification nltk naivebayes

asked Apr 04 '12 at 18:24

user179169

2 answers

How is the feature score(/importance) in the XGBoost package calculated?

The command xgb.importance returns a graph of feature importance measured by an f score. What does this f score represent and how is it calculated? Output: Graph of feature importance

python r classification feature-selection xgboost

asked Dec 11 '15 at 07:30

ishido

6 answers

Options for deploying R models in production

There don't seem to be many options for deploying predictive models in production, which is surprising given the explosion in Big Data. I understand that the open-source PMML can be used to export models as an XML specification. This can then…

r deployment machine-learning classification pmml

asked Mar 10 '14 at 19:15

Cybernetic

1 answer

Different decision tree algorithms with comparison of complexity or performance

I am doing research on data mining and, more precisely, decision trees. I would like to know whether there are multiple algorithms for building a decision tree (or just one), and which is better based on criteria such as performance, complexity, errors in…

performance machine-learning complexity-theory classification decision-tree

asked Apr 02 '12 at 15:45

Y2theZ

4 answers

What is the difference between back-propagation and feed-forward Neural Network?

What is the difference between back-propagation and feed-forward neural networks? By googling and reading, I found that in feed-forward there is only a forward direction, but in back-propagation we first need to do a forward propagation and then…

machine-learning neural-network classification backpropagation

asked Feb 09 '15 at 06:11

USB

8 answers

Error in Confusion Matrix: the data and reference factors must have the same number of levels

I've trained a Linear Regression model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error: Error in confusionMatrix.default(pred, testing$Final) : the data and reference factors must have the same…

r machine-learning artificial-intelligence classification linear-regression

asked May 02 '15 at 11:57

abcd

1 answer

What is the difference between cross-entropy and log loss error?

What is the difference between cross-entropy and log loss error? The formulae for both seem to be very similar.
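For reference, the two formulae the question mentions can be written side by side. This is a sketch under the standard definitions, with notation chosen here for illustration ($y_i$ for labels, $\hat{p}_i$ for predicted probabilities); the question itself does not fix any symbols.

```latex
% Binary log loss over N examples, with labels y_i in {0, 1}
% and predicted probabilities \hat{p}_i:
\[
L_{\log} = -\frac{1}{N} \sum_{i=1}^{N}
  \bigl[ y_i \log \hat{p}_i + (1 - y_i) \log (1 - \hat{p}_i) \bigr]
\]
% Cross-entropy between a true distribution p and a predicted
% distribution q over K classes:
\[
H(p, q) = -\sum_{k=1}^{K} p_k \log q_k
\]
% With K = 2, p = (y_i, 1 - y_i) and q = (\hat{p}_i, 1 - \hat{p}_i),
% H(p, q) equals the i-th bracketed term of L_{\log} (up to the sign and 1/N).
```

Under these definitions, binary log loss is the average cross-entropy between the one-hot labels and the predicted class probabilities.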

machine-learning classification cross-entropy

asked Jun 18 '18 at 16:07

user3303020
