I have a list of false positive rates and a list of true positive rates obtained by varying a threshold. I'm trying to calculate the AUC score, but I can't use scikit-learn's roc_auc_score, so I'm using the more general auc function instead.

This is my code:

from sklearn.metrics import auc

print(fpr_list)
#[0.4824561403508772, 0.4205607476635514, 0.41037735849056606, 0.391304347826087, 0.35467980295566504, 0.2857142857142857, 0.23195876288659795, 0.20618556701030927, 0.19170984455958548, 0.16753926701570682, 0.12105263157894737, 0.10052910052910052, 0.10052910052910052, 0.09523809523809523, 0.08465608465608465, 0.07936507936507936, 0.058823529411764705, 0.0481283422459893, 0.0427807486631016, 0.03208556149732621, 0.0374331550802139, 0.0213903743315508, 0.0213903743315508, 0.0213903743315508, 0.0213903743315508, 0.016042780748663103, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754, 0.0106951871657754]
print(tpr_list)
#[0.7619047619047619, 0.7619047619047619, 0.7619047619047619, 0.7619047619047619, 0.7523809523809524, 0.7428571428571429, 0.7238095238095238, 0.6952380952380952, 0.6952380952380952, 0.6857142857142857, 0.6761904761904762, 0.6571428571428571, 0.6476190476190476, 0.6476190476190476, 0.638095238095238, 0.638095238095238, 0.6285714285714286, 0.6, 0.6, 0.6, 0.5904761904761905, 0.5904761904761905, 0.580952380952381, 0.580952380952381, 0.5714285714285714, 0.5714285714285714, 0.5714285714285714, 0.5714285714285714, 0.5333333333333333, 0.5333333333333333, 0.5142857142857142, 0.5047619047619047, 0.4952380952380952, 0.4952380952380952, 0.4857142857142857]
print(auc(fpr_list,tpr_list))

When I call auc, I get this error:

ValueError: x is neither increasing nor decreasing

I think I understand the mistake: it seems to happen because the lists contain equal values. Is there a way to ignore this error and still calculate the AUC score?

JayJona

1 Answer

You could use scipy.integrate to compute the area under the curve. However, you will need to sort fpr_list (and reorder tpr_list to match); otherwise dx will be negative and you may get a negative value for the AUC.

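from scipy import integrate
import numpy as np

# Sort by FPR and reorder TPR the same way, so dx is never negative.
sorted_index = np.argsort(fpr_list)
fpr_list_sorted = np.array(fpr_list)[sorted_index]
tpr_list_sorted = np.array(tpr_list)[sorted_index]
# trapezoid is the current name; the older trapz alias is gone from recent SciPy releases.
print(integrate.trapezoid(y=tpr_list_sorted, x=fpr_list_sorted))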
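
As a side note, here is a minimal NumPy-only sketch of the same idea that also locates where the FPR sequence stops being monotonic. As far as I can tell, repeated values alone should not trigger this error; in the posted fpr_list the value 0.0321 is followed by 0.0374, which is a rising step inside an otherwise decreasing list. The arrays below are shortened, made-up stand-ins for the question's data, and the helper names monotonicity_breaks and trapezoid_area are hypothetical.

```python
import numpy as np

# Shortened, hypothetical stand-ins for the question's fpr_list / tpr_list.
fpr = np.array([0.48, 0.42, 0.35, 0.20, 0.10, 0.10,
                0.032, 0.037, 0.021, 0.021, 0.011])
tpr = np.array([0.76, 0.76, 0.75, 0.70, 0.66, 0.65,
                0.60, 0.59, 0.59, 0.58, 0.49])


def monotonicity_breaks(x):
    """Indices i where x[i + 1] > x[i], i.e. rising steps in a decreasing list."""
    return np.flatnonzero(np.diff(x) > 0)


def trapezoid_area(x, y):
    """Trapezoidal rule written out by hand, so it does not depend on the SciPy version."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Ties (dx == 0) are not reported here, only genuine rising steps.
breaks = monotonicity_breaks(fpr)
print("rising steps at indices:", breaks)

# Sort by FPR; lexsort uses the LAST key as primary, so ties in FPR
# are ordered by TPR, which keeps the curve non-decreasing at vertical steps.
order = np.lexsort((tpr, fpr))
area = trapezoid_area(fpr[order], tpr[order])
print("area over the observed FPR range:", area)
```

Note that the result only covers the FPR range actually present in the lists. If the points (0, 0) and (1, 1) are not included, this is a partial area rather than the full AUC over [0, 1].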
Ajay Verma