
I am developing a scikit-learn model for binary classification on an imbalanced dataset. Looking at the confusion matrix and the F1 score, I expected a low average precision score, but I get an almost perfect one and I can't figure out why. This is the output I am getting:

Confusion matrix on the test set:

[[6792  199]
 [   0  173]]

F1 score: 0.63

Test AVG precision score: 0.99

I am passing predicted probabilities to scikit-learn's average_precision_score function, which is what the documentation says to use. I was wondering where the problem could be.
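The setup described above can be sketched as follows. This is a hypothetical reconstruction, not the asker's actual code: the data is synthetic (make_classification with a roughly 97/3 class split to mimic the imbalance) and the model is a stand-in LogisticRegression. The point is which inputs each metric takes: hard labels for the confusion matrix and F1, positive-class probabilities for average precision.

```python
# Hypothetical sketch of the metric calls described in the question.
# Hard predictions feed confusion_matrix and f1_score; the positive-class
# probabilities feed average_precision_score, as the sklearn docs advise.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the asker's dataset.
X, y = make_classification(n_samples=5000, weights=[0.97], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

y_pred = clf.predict(X_te)              # hard labels, cut at probability 0.5
y_prob = clf.predict_proba(X_te)[:, 1]  # positive-class probabilities

cm = confusion_matrix(y_te, y_pred)
f1 = f1_score(y_te, y_pred)
ap = average_precision_score(y_te, y_prob)  # expects scores, not hard labels
print(cm)
print(f"F1: {f1:.2f}  AP: {ap:.2f}")
```

Passing y_pred instead of y_prob to average_precision_score would silently collapse the precision-recall curve to a single threshold, which is a common source of confusion with this metric.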

desertnaut
Joe
1 Answer


The confusion matrix and F1 score are based on hard predictions, which sklearn produces by cutting the predicted probabilities at a threshold of 0.5 (for binary classification, and assuming the classifier is genuinely probabilistic to begin with, so not e.g. SVM). Average precision, in contrast, is computed across all possible probability thresholds; it can be read as the area under the precision-recall curve.

So a high average_precision_score and a low f1_score suggest that your model does extremely well at some threshold that is not 0.5.

Ben Reiniger
  • Thanks so much! This makes sense. So I went back to the model and found it does extremely well at a threshold of 0.97 instead of 0.5, so I changed the threshold. If the data is actually balanced, is it okay to move the threshold this drastically? – Joe May 03 '22 at 14:49
  • Your data _isn't_ balanced, as you've reported. Put the threshold wherever it makes business sense. It's surprising to me that the "optimal" threshold is so high and gives such a high score, though; if your model can be extremely confident at that threshold, it should've been able to give more confident prediction probabilities. Maybe ask a followup question at stats.SE or datascience.SE giving more details about the model and the data? (As is, this question is maybe off-topic on SO, but these followups definitely are.) – Ben Reiniger May 03 '22 at 15:11