I have the following code:
from sklearn.metrics import roc_curve, auc
actual = [1,1,1,0,0,1]
prediction_scores = [0.9,0.9,0.9,0.1,0.1,0.1]
false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, prediction_scores, pos_label=1)
roc_auc = auc(false_positive_rate, true_positive_rate)
roc_auc
# 0.875
In this example the interpretation of prediction_scores is straightforward: the higher the score, the more likely the sample belongs to the positive class.
Now I have another set of prediction scores. They are non-fractional, and the interpretation is reversed: the lower the score, the more likely the sample belongs to the positive class.
prediction_scores_v2 = [10.3,10.3,10.2,10.5,2000.34,2000.34]
# intended to mirror the first example, but with lower scores indicating the positive class
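If I pass prediction_scores_v2 to roc_curve unchanged, scikit-learn treats higher values as evidence for the positive class, so the ranking is read backwards and the AUC comes out far below the first example (roughly 0.19 when I run it):

false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, prediction_scores_v2, pos_label=1)
roc_auc_v2_raw = auc(false_positive_rate, true_positive_rate)
roc_auc_v2_raw
# ~0.19 -- far from the 0.875 above, even though the scores rank the samples almost the same way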
My question is: how can I transform the values in prediction_scores_v2 so that they yield an AUC similar to the first example?
To put it another way, scikit-learn's roc_curve expects y_score to be oriented toward the positive class (higher means more likely positive). How should I treat the values if the y_score I have is oriented the other way, i.e. lower values indicate the positive class?
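For context, the only workaround I have come up with so far is to flip the orientation of the scores myself before calling roc_curve, on the assumption that roc_curve only cares about the ranking and not about the values being actual probabilities. A minimal sketch of that idea (negating the scores so that lower raw values become higher y_score):

# Assumption: roc_curve accepts any real-valued scores where higher means
# "more likely positive", so negating reverses the orientation.
flipped_scores = [-s for s in prediction_scores_v2]

false_positive_rate, true_positive_rate, thresholds = roc_curve(actual, flipped_scores, pos_label=1)
roc_auc_v2 = auc(false_positive_rate, true_positive_rate)
roc_auc_v2
# ~0.81 for me -- close to the 0.875 above, though not identical because of the tie at 2000.34

Is negating (or otherwise rescaling) the scores like this the correct way to handle a "lower is better" y_score, or does scikit-learn provide a built-in way to indicate the orientation?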