
I have a binary classification problem in which one class is about one tenth the size of the other.

I am using sklearn to create a model and evaluate it, with these two snippets. First:

from sklearn.metrics import precision_recall_fscore_support
print(precision_recall_fscore_support(y_real, y_pred))

out: 
(array([0.99549296, 0.90222222]), # precision of the first class and the second class
 array([0.98770263, 0.96208531]), # recall of the first class and the second class
 array([0.99158249, 0.93119266]), # F1 score of the first class and the second class
 array([1789,  211]))             # instances of the first class and the second class

This returns the precision, recall, F-score, and support for each class.

from sklearn.metrics import precision_score, recall_score
print(precision_score(y_real, y_pred), recall_score(y_real, y_pred))

out:
0.90222222 0.96208531 # precision and recall of the model

This returns a single precision and recall for the prediction.

Why do precision_score and recall_score return exactly the same values as those of the class with fewer instances (here, the class with 211 instances)?


2 Answers


Looking closely at the documentation of both precision_score and recall_score you will see two arguments - pos_label, with a default value of 1, and average, with a default value of 'binary':

pos_label : str or int, 1 by default

The class to report if average='binary' and the data is binary.

average : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

In other words, as explained clearly in the docs, these two functions return respectively the precision and recall of one class only - the one designated with the label 1.

From what you show, it would seem that this class is what you call 'second class' here, and the results indeed are consistent with what you report.
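
To see this directly, you can ask for the other class explicitly via the pos_label argument; a minimal sketch, reusing the y_real and y_pred from the question and assuming the labels are 0 and 1:

from sklearn.metrics import precision_score, recall_score

# default pos_label=1 reports the minority ('second') class only
print(precision_score(y_real, y_pred), recall_score(y_real, y_pred))
# 0.90222222 0.96208531

# switching pos_label reproduces the first entries returned by
# precision_recall_fscore_support
print(precision_score(y_real, y_pred, pos_label=0),
      recall_score(y_real, y_pred, pos_label=0))
# 0.99549296 0.98770263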

In contrast, the precision_recall_fscore_support function, according to the docs (emphasis mine):

Compute precision, recall, F-measure and support for each class

In other words, there is nothing strange or unexpected here; there is no "overall" precision and recall, as they are always by definition computed per class. Practically speaking, in imbalanced binary settings like yours, they are usually reported for the minority class only.
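
If you nevertheless want a single aggregate number, you have to opt in to an averaging scheme via the average argument; a short sketch, again with the y_real and y_pred from the question:

from sklearn.metrics import precision_score

print(precision_score(y_real, y_pred, average=None))       # per class: [0.99549296, 0.90222222]
print(precision_score(y_real, y_pred, average='macro'))    # unweighted mean of the per-class values
print(precision_score(y_real, y_pred, average='weighted')) # mean weighted by the class supports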

desertnaut

It may be due to the imbalanced dataset. You can try oversampling the under-represented class or undersampling the over-represented class, depending on the level of variance in your data. I had a similar issue with imbalanced data, and this article helped:

Medium Article on imbalanced data
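
For reference, here is a minimal oversampling sketch with sklearn.utils.resample; X and y are hypothetical names for your training features and 0/1 labels, with class 1 assumed to be the minority. Note this should be applied to the training split only, never to the evaluation data:

import numpy as np
from sklearn.utils import resample

X_min, y_min = X[y == 1], y[y == 1]  # minority class (assumed label 1)
X_maj, y_maj = X[y == 0], y[y == 0]  # majority class

# draw minority samples with replacement until both classes are the same size
X_up, y_up = resample(X_min, y_min, replace=True,
                      n_samples=len(X_maj), random_state=42)

X_bal = np.vstack([X_maj, X_up])
y_bal = np.concatenate([y_maj, y_up])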

sog
  • It could be that the categories have similar variances in your predictor variables. Without knowing more about the data set, it is hard to say. Another solution may be to add more features (predictor variables) and retrain your model. Or, if you have a lot of features, Scikit-learn has some feature selection functions that can help ensure you give your model the features that will be most helpful in differentiating the categories. – sog Nov 10 '20 at 22:39
  • This is a question about what exactly is returned by the sklearn functions `precision_score` and `recall_score`, and has nothing to do with oversampling, undersampling, variance, or retraining of the model with or without additional features. – desertnaut Nov 10 '20 at 23:26