
I have a question about roc_curve from scikit-learn for a deep learning exercise. I have noticed that my data has 1 as the positive label. After training, the test accuracy comes out around 74%, but the ROC area under the curve (AUC) score is only 0.24.

from sklearn import metrics

# Model scores for the test pairs
y_pred = model.predict([x_test_real[:, 0], x_test_real[:, 1]])
fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred, pos_label=1)
roc_auc = metrics.auc(fpr, tpr)
print("roc_auc:  %0.2f" % roc_auc)

If I change pos_label to 0, the AUC score becomes 0.76 (obviously, since that is just 1 - 0.24):

y_pred = model.predict([x_test_real[:, 0], x_test_real[:, 1]])
fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred, pos_label=0)
roc_auc = metrics.auc(fpr, tpr)
print("roc_auc:  %0.2f" % roc_auc)

Now I ran a small experiment: I flipped my training and testing labels (this is a binary classification problem):

y_train_real = 1 - y_train_real
y_test_real = 1 - y_test_real

This should swap the positive and negative labels (1 becomes 0 and vice versa). Then I ran my code again, this time expecting the ROC AUC behavior to flip as well. But no!

fpr, tpr, thresholds = metrics.roc_curve(y_test_real, y_pred, pos_label=0)

still gives 0.80, while pos_label=1 gives 0.20. This is confusing me:

  • If I change the positive label in my training targets, should that not affect the roc_curve AUC values?
  • Which case is the correct analysis?
  • Does the output have anything to do with the loss function used? I am solving a binary classification problem (match vs. no match) using contrastive loss.

Can anyone help me here? :)


1 Answer


It would be great if you could post code and output like this:

import numpy as np
from sklearn import metrics

# Random scores and random binary ground-truth labels
y_pred = np.random.rand(100,)
y_true = np.random.randint(0, 2, (100,))

# Same scores, opposite choices of pos_label
fpr, tpr, thresholds = metrics.roc_curve(y_true, y_pred, pos_label=1)
print(metrics.auc(fpr, tpr))

fpr, tpr, thresholds = metrics.roc_curve(y_true, y_pred, pos_label=0)
print(metrics.auc(fpr, tpr))

# Now flip the ground-truth labels and repeat with the same scores
y_true_new = 1 - y_true

fpr, tpr, thresholds = metrics.roc_curve(y_true_new, y_pred, pos_label=1)
print(metrics.auc(fpr, tpr))

fpr, tpr, thresholds = metrics.roc_curve(y_true_new, y_pred, pos_label=0)
print(metrics.auc(fpr, tpr))

Output:

0.5291047771979125
0.4708952228020875
0.4708952228020875
0.5291047771979125
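
Notice how the four numbers pair up: with the scores held fixed, flipping y_true complements the AUC, flipping pos_label complements it as well, and doing both cancels out. In your experiment you flipped the training labels too, so (assuming you retrained) the model's scores flipped their meaning along with the labels, and the two flips cancel; that is why the values barely moved.

The AUC below 0.5 with pos_label=1 is most likely down to the loss. Contrastive loss typically trains the model to output a distance, which is small for matching pairs, so the score decreases as the positive class becomes more likely. A minimal sketch with synthetic distances (not your model's actual outputs) to illustrate:

import numpy as np
from sklearn import metrics

rng = np.random.default_rng(0)

# Pretend the model outputs a distance: small for matches (label 1),
# large for non-matches (label 0)
y_true = rng.integers(0, 2, 200)
dist = np.where(y_true == 1,
                rng.normal(0.3, 0.2, 200),
                rng.normal(1.0, 0.2, 200))

# Raw distance with pos_label=1: AUC lands below 0.5, because here a
# high score means the sample is LESS likely to be positive
fpr, tpr, _ = metrics.roc_curve(y_true, dist, pos_label=1)
print(metrics.auc(fpr, tpr))

# Negating the distance turns it into a similarity score and restores
# the meaningful AUC (exactly 1 minus the value above)
fpr, tpr, _ = metrics.roc_curve(y_true, -dist, pos_label=1)
print(metrics.auc(fpr, tpr))

So neither pos_label value is "correct" by itself; what matters is that the score you pass to roc_curve increases with the positive class. With a distance-like output, either negate the score and keep pos_label=1, or keep the raw distance and read the result under pos_label=0.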