sklearn's roc_curve() function returns thresholds and fpr of different dimensions

Question

I assume that roc_curve() computes fpr and tpr for each value of thresholds. But the following code shows that fpr and thresholds have different dimensions.

from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)

fpr.shape         # (3908,)
thresholds.shape  # (59966,)

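I am also wondering why, in the following code, the length of precisions differs from the length of thresholds by one:

from sklearn.metrics import precision_recall_curve

precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)

precisions.shape  # (59967,)
thresholds.shape  # (59966,)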

amiola · Accepted Answer · 2021-02-20T12:07:46.980

As for roc_curve(), unlike precision_recall_curve(), the lengths of its outputs depend on the drop_intermediate option (True by default), which drops suboptimal thresholds that would not appear on a plotted ROC curve (see the roc_curve documentation for reference).
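To make the effect of drop_intermediate concrete, here is a minimal sketch on synthetic data. The arrays y_true and y_scores are made-up stand-ins for y_train_5 and y_scores from the question, so the sizes will not match the ones reported there. Note that fpr, tpr and the thresholds returned by a single roc_curve() call always have the same length. One possible explanation for the mismatch in the question (an assumption on my part) is that the name thresholds was later reassigned by the precision_recall_curve() call.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Synthetic stand-ins for y_train_5 / y_scores (assumed data, not the asker's).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_scores = rng.normal(size=1000) + y_true  # positives score higher on average

# Default behaviour: drop_intermediate=True prunes suboptimal thresholds.
fpr, tpr, roc_thresholds = roc_curve(y_true, y_scores)

# Keep every distinct threshold instead.
fpr_all, tpr_all, roc_thresholds_all = roc_curve(
    y_true, y_scores, drop_intermediate=False
)

# Within one call, the three outputs share the same length.
print(fpr.shape, tpr.shape, roc_thresholds.shape)
print(fpr_all.shape, tpr_all.shape, roc_thresholds_all.shape)
```

Using distinct names such as roc_thresholds keeps the ROC thresholds from being overwritten by a later call that also returns thresholds.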

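As for the second point, precision_recall_curve() stops outputting thresholds once full recall is reached, which might be part of the reason. In addition, the documentation states that the last precision and recall values are 1 and 0 respectively and have no corresponding threshold, which accounts for a difference of exactly one.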
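The off-by-one can be checked with a short sketch. Again, y_true and y_scores are synthetic stand-ins (an assumption, not the data from the question), and the names aligned_precisions and aligned_recalls are only illustrative.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Synthetic stand-ins for y_train_5 / y_scores (assumed data, not the asker's).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_scores = rng.normal(size=1000) + y_true  # positives score higher on average

precisions, recalls, pr_thresholds = precision_recall_curve(y_true, y_scores)

# precisions and recalls have one more element than pr_thresholds.
print(precisions.shape, recalls.shape, pr_thresholds.shape)

# The extra element is the final point (precision=1, recall=0),
# which has no threshold attached to it.
print(precisions[-1], recalls[-1])

# Drop that final point to align the arrays with the thresholds,
# for example before plotting precision and recall against the threshold.
aligned_precisions = precisions[:-1]
aligned_recalls = recalls[:-1]
```

After slicing off the last element, precisions and recalls line up one-to-one with the thresholds.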

edited Feb 20 '21 at 12:07

answered Feb 19 '21 at 15:22

