Questions tagged [classification]

In machine learning and statistics, classification is the problem of identifying which of a set of categories a new observation belongs to, on the basis of a training set of data containing observations whose category membership (label) is known.

In machine learning and statistics, classification refers to the problem of predicting category memberships based on a set of pre-labeled examples. It is thus a type of supervised learning.

Some of the most important classification algorithms are support vector machines, logistic regression, naive Bayes, random forest and artificial neural networks.

When we wish to associate inputs with continuous values in a supervised framework, the problem is instead known as regression. The unsupervised counterpart to classification is known as clustering (or cluster analysis), and involves grouping data into categories based on some measure of inherent similarity.
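
As a rough illustration of the supervised setup described above, here is a minimal sketch of a nearest-centroid classifier in plain Python. The helper names (`fit_centroids`, `predict`) and the toy data are made up for this example; it is not one of the algorithms listed above, just about the simplest way to show "learn from labeled examples, then label a new observation".

```python
from collections import defaultdict
from math import dist


def fit_centroids(features, labels):
    """Average the training points of each class into one centroid per label."""
    grouped = defaultdict(list)
    for point, label in zip(features, labels):
        grouped[label].append(point)
    return {
        label: tuple(sum(axis) / len(points) for axis in zip(*points))
        for label, points in grouped.items()
    }


def predict(centroids, point):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Toy training set (hypothetical): two features per observation, known labels.
train_x = [(1.0, 1.2), (0.8, 0.9), (4.0, 4.1), (4.2, 3.9)]
train_y = ["small", "small", "large", "large"]

centroids = fit_centroids(train_x, train_y)
prediction = predict(centroids, (3.8, 4.0))
print(prediction)
```

If the labels were continuous values instead of categories, the same data would pose a regression problem; if there were no labels at all, grouping the points by similarity would be clustering.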

7859 questions
2
votes
3 answers

How to combine binary classification and regression problems

I am trying to solve the following question: A person might or might not like steaks, but that statistically depends on the person's age, ethnicity, gender, etc. A steak loving person might like their steaks from 0% cooked to 100% cooked, and seasoned with an…
2
votes
0 answers

How do you train a neural network to predict a set number of labels for every input?

I am working on a project where we are trying to classify gene expression using a neural network. We are using Keras. We have the sequences of 35000 genes. For each of these genes, we know how much they are expressed in 28 different tissues.…
2
votes
1 answer

Validation accuracy metrics reported by Keras model.fit log and Sklearn.metrics.confusion_matrix don't match each other

The problem is that the reported validation accuracy value I get from Keras model.fit history is significantly higher than the validation accuracy metric I get from sklearn.metrics functions. The results I get from model.fit are summarized…
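
When comparing two accuracy figures like this, it can help to recompute accuracy directly from a confusion matrix so both numbers are known to come from the same predictions. The sketch below does that in plain Python; the function names and the toy labels are assumptions for illustration, and the row = true class, column = predicted class layout mirrors the convention that `sklearn.metrics.confusion_matrix` documents.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Accuracy is the diagonal (correct predictions) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical validation labels and predictions for three classes.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]

matrix = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = accuracy_from_confusion(matrix)
print(matrix, accuracy)
```

If a figure computed this way disagrees with the training log, the two are probably not being evaluated on the same samples, in the same order, or with the same preprocessing.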
2
votes
1 answer

Loss reduction in canned TF estimators

I use a TensorFlow canned estimator (LinearClassifier) to predict game actions from situations, favouring the best scores. Scores are included in train_data, used as weights, and passed as the weight column in the estimator. I know weight values are…
GerardL
2
votes
1 answer

How does TextCategorizer.predict work with spaCy?

I've been following the spaCy quick-start guide for text classification. Let's say I have a very simple dataset. TRAIN_DATA = [ ("beef", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}), ("apple", {"cats": {"POSITIVE": 0, "NEGATIVE":…
Union find
2
votes
2 answers

How to validate a prediction in Keras

I am going through the Kaggle Digit Recognizer Tutorial and I'm trying to understand how all of this works. I would like to validate a predicted value. Basically, I have a prediction that's wrong, but I want to see what the actual value of that…
mwilson
2
votes
0 answers

How can I modify Keras/TensorFlow code to (not) take position in the picture into account?

I would like to classify images using Keras and TensorFlow in RStudio. In a normal image classification model, how can I tell TensorFlow to take or NOT to take into account the position within the picture? For example, if I want the model to say,…
Nicholas
2
votes
3 answers

Cleveland heart disease dataset - can’t describe the class

I’m using the Cleveland Heart Disease dataset from UCI for classification, but I don’t understand the target attribute. The dataset description says that the values go from 0 to 4 but the attribute description says: 0: < 50% coronary disease 1: >…
2
votes
2 answers

RandomForestClassifier in a multi-label problem - how does it work?

How does the RandomForestClassifier of sklearn handle a multilabel problem (under the hood)? For example, does it break the problem into distinct single-label problems? Just to be clear, I have not really tested it yet but I see y : array-like, shape =…
Outcast
2
votes
1 answer

How to calculate class weights for Random forests

I have datasets for 2 classes on which I have to perform binary classification. I chose random forest as the classifier because it gives me the best accuracy among the models I tried. The number of data points in dataset-1 is 462, and dataset-2 contains 735…
user2273202
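
One common starting point for a question like this is the "balanced" heuristic, which weights each class by `n_samples / (n_classes * class_count)`; this is the formula scikit-learn documents for `class_weight="balanced"`. The sketch below computes it in plain Python. It assumes that each of the two datasets in the excerpt corresponds to one class, which the truncated question does not fully confirm, and the function name is made up for illustration.

```python
def balanced_class_weights(class_counts):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: total / (n_classes * count)
        for label, count in class_counts.items()
    }


# Assumption: dataset-1 and dataset-2 each hold the samples of one class.
weights = balanced_class_weights({"dataset-1": 462, "dataset-2": 735})
for label, weight in weights.items():
    print(label, round(weight, 4))
```

The minority class ends up with a weight above 1 and the majority class with a weight below 1, so that both classes contribute equally in total. Whether this helps should be checked on held-out data rather than taken for granted.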
2
votes
0 answers

How does the gradient descent work when the one hot encoded labels are all zeros?

I've got a dataset of images of a vineyard taken from above that I'd like to sub-classify. There are 3 big classes [ground,vegetal,vineyard], and some of them are divided into several subclasses: vegetal:[grass,flower], vineyard:[healthy, disease A,…
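
To make the question in the title concrete, here is a small sketch that assumes a softmax output layer trained with categorical cross-entropy; the truncated excerpt does not say which loss is actually used, so treat that as an assumption. Under it, the gradient of the loss with respect to logit j is `p_j * sum(target) - target_j`, which is zero everywhere when the target vector is all zeros.

```python
from math import exp, log


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Categorical cross-entropy: -sum(target_i * log(p_i))."""
    probs = softmax(logits)
    return -sum(t * log(p) for t, p in zip(target, probs))


def cross_entropy_grad(logits, target):
    """Gradient of the loss w.r.t. the logits: p_j * sum(target) - target_j."""
    probs = softmax(logits)
    mass = sum(target)
    return [p * mass - t for p, t in zip(probs, target)]


logits = [2.0, -1.0, 0.5]  # hypothetical network outputs for one sample
grad_one_hot = cross_entropy_grad(logits, [1.0, 0.0, 0.0])
grad_all_zero = cross_entropy_grad(logits, [0.0, 0.0, 0.0])
loss_all_zero = cross_entropy(logits, [0.0, 0.0, 0.0])
print(grad_one_hot, grad_all_zero, loss_all_zero)
```

In this setting a sample with an all-zero label contributes neither loss nor gradient, so it is effectively ignored during training. With independent sigmoid outputs and binary cross-entropy the situation is different: an all-zero label still pushes every output towards zero.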
2
votes
1 answer

How should systematic uncertainties (up and down) in training data be handled in classification neural networks?

I have a classification neural network and nominal input data on which it is trained; however, each feature of the input data has a systematic (up and down) uncertainty. How should the accuracy of the classifier be qualified and visualised using…
BlandCorporation
2
votes
1 answer

Using BERT in order to detect language of a given word

I have words in the Hebrew language. Some of them are originally in English, and some of them are 'Hebrew English', meaning that they are words that come from English but are written with Hebrew letters. For example: 'insulin' in Hebrew is…
jonb
2
votes
1 answer

What change should I make in the Image Data Generator to solve the error?

I am trying to perform Mixup Augmentation (details here) but I am getting a value error as follows: ValueError: operands could not be broadcast together with shapes (47,128,128,3) (64,1,1,1) Following is the MixupImageDataGenerator class…
dravid07
2
votes
1 answer

How to get classification probabilities in Keras?

I am trying to get classification probabilities out of my trained Keras model but when I use the model.predict (or model.predict_proba) method, all I get is an array of this form: array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]], dtype=float32) So…
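
One possible reason for output that looks one-hot, offered only as a guess since the excerpt is truncated, is a saturated softmax: when one logit is much larger than the rest (for example because inputs were not scaled), the probabilities are so close to 0 and 1 that they print as exactly `0.` and `1.`. The plain-Python sketch below illustrates the effect with made-up logits; it does not call Keras.

```python
from math import exp


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: a moderate spread versus one dominant value.
moderate = softmax([1.0, 2.0, 3.0])
saturated = softmax([10.0, 20.0, 90.0])

# Rounding mimics how a low-precision array is displayed.
rounded = [round(p, 4) for p in saturated]
print([round(p, 4) for p in moderate])
print(rounded)
```

The saturated values are still probabilities; they are just extremely confident ones. If that is what is happening, checking input scaling is a reasonable first step.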