
I just want to clarify: what is the difference between

roc_auc_score(y_test,results.predict(X_test))

and

roc_auc_score(y_test,results.predict_proba(X_test)[:,1])

I know the latter returns the estimated probability of class 1 for each test observation, and that predict_proba() should be used when plotting the roc_curve. But which is the right way to check a binary classification model's performance with ROC AUC? I currently use the former. What does the latter one mean?
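To make the difference concrete, here is a minimal sketch; the synthetic data and the LogisticRegression model below are hypothetical stand-ins for the question's results object, not taken from the post:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the question's data and fitted model
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
results = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict() returns hard 0/1 class labels (probabilities thresholded at 0.5)
print(results.predict(X_test)[:5])

# predict_proba() returns one column per class, ordered as in results.classes_,
# so [:, 1] selects the estimated probability of class 1 for each observation
print(results.classes_)
print(results.predict_proba(X_test)[:5, 1])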

LUSAQX

1 Answer


The second one is correct.
Your predictions should be continuous scores (e.g., probabilities) so that they can be ranked, since ROC AUC only cares about how the predictions rank the positives relative to the negatives.
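To illustrate the ranking point with made-up numbers (not from the question): any monotone transformation of the scores leaves the AUC unchanged, while thresholding them into hard labels, which is what predict() does at 0.5, throws the ranking away.

import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1])
# Scores that rank every positive above every negative,
# yet never exceed the default 0.5 decision threshold
scores = np.array([0.10, 0.20, 0.30, 0.35, 0.40, 0.45])

# AUC depends only on the ranking, so monotone transforms do not change it
print(roc_auc_score(y_true, scores))          # 1.0
print(roc_auc_score(y_true, scores ** 3))     # 1.0
print(roc_auc_score(y_true, np.log(scores)))  # 1.0

# Hard labels collapse everything to one threshold; here all predictions
# become 0 and the label-based AUC drops to chance level
hard_labels = (scores >= 0.5).astype(int)
print(roc_auc_score(y_true, hard_labels))     # 0.5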

f311a
  • OK. But when I used the second one in my project, it returned an ROC AUC of 0.75; when I then checked the confusion matrix, I found that the model classifies every test case as class 0, which is far more abundant in my dataset than class 1. What is the underlying problem? On the other hand, when I used the first one to calculate the ROC AUC, it returned 0.5, which seemed reasonable given that the model returns only one class for all test observations. – LUSAQX Dec 07 '16 at 20:28
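A sketch of the scenario described in that comment, using synthetic imbalanced data rather than the original project's dataset: with a rare positive class, the default 0.5 threshold inside predict() can send every test case to class 0, so the confusion matrix looks degenerate and the label-based AUC sits near 0.5, while the probability-based AUC can still be well above 0.5 because the probabilities still rank the rare positives higher.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Heavily imbalanced synthetic data (class 1 is rare); illustrative only
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05],
                           class_sep=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# With the default 0.5 threshold the model may predict class 0 for
# (almost) every test case, giving a near-empty second column
print(confusion_matrix(y_test, model.predict(X_test)))

# AUC from the hard labels is then close to 0.5, while AUC from the
# class-1 probabilities reflects the ranking and can be much higher
print(roc_auc_score(y_test, model.predict(X_test)))
print(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))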