I am trying to plot the ROC curve to evaluate an Isolation Forest on a breast cancer dataset. I calculated the true positive rate (TPR) and false positive rate (FPR) from the confusion matrix. However, I do not understand why the TPR and FPR come out as arrays instead of single values. The ROC plot also seems to work only when FPR and TPR are arrays (I also tried writing the FPR and TPR calculation by hand).
Are the TPR and FPR values always arrays?
Either way, my ROC curve comes out as a straight line. Why is that?
Confusion Matrix :
from sklearn.metrics import confusion_matrix
cnf_matrix = confusion_matrix(y, y_pred_test1)
O/P :
> [[ 5 25]
> [ 21 180]]
True Positive and False Positive : (Also, why are these values directly taken from the confusion matrix?)
import numpy as np
# Per-class counts: entry i treats class i as the positive class.
F_P = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
F_N = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
T_P = np.diag(cnf_matrix)
T_N = cnf_matrix.sum() - (F_P + F_N + T_P)
F_P = F_P.astype(float)
F_N = F_N.astype(float)
T_P = T_P.astype(float)
T_N = T_N.astype(float)
O/P :
False Positive [21. 25.]
False Negative [25. 21.]
True Positive [ 5. 180.]
True Negative [180. 5.]
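A minimal sketch of how the four counts could be read as single values in the binary case. It assumes scikit-learn's layout (rows are true labels, columns are predicted labels) and treats the second class as the positive one. Both are assumptions, and the matrix is hard-coded from the output above.

```python
import numpy as np

# Confusion matrix copied from the output above.
# Assumption: rows are true labels, columns are predicted labels.
cnf_matrix = np.array([[5, 25],
                       [21, 180]])

# Assumption: the second class (index 1) is the positive class.
# Flattening a 2x2 matrix row by row then yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cnf_matrix.ravel()

print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```

These values match the second entry of each array printed above. The first entry of each array is the same calculation with the first class treated as positive, which is why every quantity comes out with two elements.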
TPR and FPR :
tp_rate = T_P/(T_P+F_N)
fp_rate = F_P/(F_P+T_N)
O/P :
TPR : [0.16666667 0.89552239]
FPR [0.10447761 0.83333333]
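The sketch below illustrates why one set of hard predictions gives only a single ROC point. It assumes the second class is the positive one and takes its counts from the confusion matrix above. The end points (0, 0) and (1, 1) correspond to the trivial "never predict positive" and "always predict positive" classifiers, and `roc_x`, `roc_y`, and `area` are illustrative names.

```python
import numpy as np

# Counts with the second class treated as positive (assumption),
# taken from the confusion matrix above.
tn, fp, fn, tp = 5.0, 25.0, 21.0, 180.0

tpr = tp / (tp + fn)  # true positive rate (recall)
fpr = fp / (fp + tn)  # false positive rate

# One set of hard labels gives one operating point; joining it to the
# trivial end points (0, 0) and (1, 1) gives a two-segment "curve".
roc_x = np.array([0.0, fpr, 1.0])
roc_y = np.array([0.0, tpr, 1.0])

# Area under that polyline by the trapezoidal rule.
area = float(np.sum(np.diff(roc_x) * (roc_y[1:] + roc_y[:-1]) / 2.0))

print("TPR:", tpr, "FPR:", fpr, "area:", area)
```

Plotting the two-element per-class arrays directly only joins two points, which would be expected to produce a single straight segment rather than a curve.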
ROC curve :
from sklearn import metrics
import matplotlib.pyplot as plt
plt.plot(fp_rate,tp_rate)
plt.show()
O/P : [plot: a straight line]
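A ROC curve is normally traced by sweeping a threshold over continuous scores rather than over hard labels. The following is a minimal sketch under several assumptions: it uses scikit-learn's bundled breast cancer data as a stand-in for the dataset in question, treats the malignant class as the anomaly, and uses the negated `score_samples` output of `IsolationForest` as the anomaly score. None of these choices come from the code above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_curve, roc_auc_score

# Assumption: scikit-learn's bundled dataset stands in for the real data.
data = load_breast_cancer()
X = data.data
# Assumption: malignant samples (target == 0) are the anomalies (positive class).
y_true = (data.target == 0).astype(int)

clf = IsolationForest(random_state=0).fit(X)

# score_samples is lower for more abnormal points, so negate it to get
# a score that increases with how anomalous a sample looks.
anomaly_score = -clf.score_samples(X)

# roc_curve sweeps a threshold over the scores and returns one
# (FPR, TPR) pair per threshold, which is what produces an actual curve.
fpr, tpr, thresholds = roc_curve(y_true, anomaly_score)
auc = roc_auc_score(y_true, anomaly_score)

print("number of ROC points:", len(fpr))
print("AUC:", auc)
```

Passing these `fpr` and `tpr` arrays to `plt.plot(fpr, tpr)` would then draw the curve with many points instead of two.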