I'm using scikit-learn for binary classification, but the labels are not evenly distributed across the dataset. Since I'm interested in predicting the minority class, I have some concerns about the average precision metric provided by metrics.average_precision_score. When I run the experiments and print a classification report, I see good precision overall, but that is clearly driven by the model doing well on the majority class, something like this:
                   precision  recall  f1-score  support
label of interest       0.24    0.67      0.35       30
non-label               0.97    0.81      0.88      300
The average precision is then reported as somewhere around 0.9752.
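For reference, here is a minimal sketch of roughly what I'm doing; the toy data, variable names, and classifier below are placeholders standing in for my actual pipeline:

```python
# Minimal sketch of my setup -- toy data and model stand in for my real pipeline.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, classification_report
from sklearn.model_selection import train_test_split

# Imbalanced toy data: roughly 10% minority class, like my label of interest.
X, y = make_classification(n_samples=3300, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_scores = clf.predict_proba(X_test)[:, 1]  # column 1 = predicted probability of class 1

print(classification_report(y_test, y_pred))
# This is the number I'm unsure about -- I suspect it reflects the majority class:
print(average_precision_score(y_test, y_scores))
```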
This average precision score is clearly being reported with respect to the majority class, which isn't really the class I'm interested in identifying. Is there some way to have metrics.average_precision_score report the metric with respect to the minority class of interest? Any insight would be greatly appreciated - thanks for reading.