
I am quite new to machine learning and Python. Any help would be appreciated.

In Matlab this is usually easy to plot. I want to draw the ROC curve to evaluate the performance of a face recognition system. I calculate the Euclidean distance and the cosine similarity between two images, and I would like to compute these two measures over a database of images (test and train sets). How can I draw the ROC curve for this image database?

Also, how can I measure the performance of an autoencoder?

This code doesn't work:

predictions_prob = your_model.predict_proba(x_test)
false_positive_rate, recall, thresholds = roc_curve(y_test, predictions_prob[:,1])
roc_auc = auc(false_positive_rate, recall)
plt.plot(false_positive_rate, recall, 'g', label = 'AUC %s = %0.2f' % ('model name', roc_auc))
plt.plot([0,1], [0,1], 'r--')
plt.legend(loc = 'lower right')
plt.ylabel('Recall')
plt.xlabel('Fall-out')
plt.title('ROC Curve')

I am using pre-trained model weights.

So now I have two arrays. y_true is 1 if the two faces are the same and 0 if not:

y_true [0 1 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0]

The y_score array:

[0.43031937 0.09115553 0.00650781 0.02242869 0.38608587 0.09407699
 0.40521139 0.08062053 0.37445426 0.73493853 0.7103999  0.72978038
 0.66644344 0.63952136 0.61384821 0.58388719 0.64563826 0.7302449
 0.50854671 0.74351138 0.74457312 0.86807218 0.83802608 0.74165669
 0.74858481 0.76547028 0.73587325 0.78119443 0.59438175 0.74271324
 0.65287331 0.55672997 0.6840947  0.86698833 0.69892132 0.9039218
 0.73688647 0.88281097 0.65161654 0.6082072  0.60127196 0.59740826
 0.63763261 0.60536379 0.642178   0.61151108 0.62726742 0.61947313
 0.67193428 0.7865534  0.65491107 0.6640633  0.68394253 0.63343072
 0.79708609 0.78625438 0.70690271 0.75213048 0.76652744 0.85628764
 0.82893997 0.75122409 0.76786727 0.7644964  0.75824204 0.78366616
 0.65271395 0.75293976 0.72236988 0.56250972 0.72455084 0.9160955
 0.74614334 0.94117467 0.75922103 0.91618422]

When I run the code, I get this plot:

(plot image)

What should I change, the scores or the labels? I'm lost, and any help would be appreciated.

I don't know why I get only 4 elements in tpr, fpr, and thresholds:

fpr [0. 0. 0. 1.]
tpr [0.  0.2 1.  1. ]
thresholds [1.99308544 0.99308544 0.90004301 0.        ]
Guizmo Charo

1 Answer


Assuming y_test is a numpy array containing 0 and 1, where 0 means the two faces are not the same (negative) and 1 means the two faces are the same (positive).

Also assuming you use verifyFace for prediction. Let's say its output is pred, which contains the distance between each pair.

By definition, two faces whose distance is lower than a threshold are considered positive. This is the opposite of a typical binary classification task, where a higher score means positive.

So here is a workaround:

from sklearn.metrics import roc_curve, auc
import numpy as np
import matplotlib.pyplot as plt

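# Placeholder data standing in for the real distances and labels
n_samples = 1000
pred = np.random.rand(n_samples)  # non-negative, like a distance
y_test = np.random.randint(2, size=(n_samples,))

# Flip the distances so that a larger value means "more likely the same face"
max_dist = max(pred)
pred = np.array([1-e/max_dist for e in pred])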
fpr, tpr, thresholds = roc_curve(y_test, pred)
roc_auc = auc(fpr, tpr)
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange',
         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()

The key idea is to convert pred so that it behaves like a confidence score: larger values should mean the pair is more likely to be positive.
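Since the ROC curve depends only on how the scores rank the pairs, any transformation that reverses the order of the distances should give the same AUC. The sketch below illustrates this with made-up data; the distances, their distribution, and the variable names are assumptions for illustration, not values from the question.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical data: label 1 = same face, 0 = different faces.
rng = np.random.default_rng(0)
y_test = rng.integers(0, 2, size=200)

# Assumed distances: matching pairs tend to be closer than non-matching pairs.
dist = np.where(y_test == 1,
                rng.normal(0.4, 0.15, 200),
                rng.normal(0.8, 0.15, 200))
dist = np.clip(dist, 0.0, None)  # a distance cannot be negative

# Two ways of turning "small is positive" into "large is positive".
score_neg = -dist
score_norm = 1 - dist / dist.max()

auc_neg = roc_auc_score(y_test, score_neg)
auc_norm = roc_auc_score(y_test, score_norm)

# Passing the raw distance instead mirrors the curve: the AUC becomes 1 - AUC.
auc_raw = roc_auc_score(y_test, dist)

print(auc_neg, auc_norm, auc_raw)
```

If the model returns a similarity rather than a distance (larger means more alike), it can be passed to roc_curve as it is, with no flipping.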

Ref: How to use prediction score in creating ROC curve with Scikit-Learn

Receiver Operating Characteristic (ROC)
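As for getting only 4 elements in fpr, tpr and thresholds: roc_curve has a drop_intermediate parameter, True by default, that discards points lying on a straight segment of the curve. When the scores separate the two classes perfectly (AUC of 1), only the corner points remain. A minimal sketch with made-up, perfectly separable scores:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical scores: every positive pair outranks every negative pair.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90])

# Default behaviour: collinear points are dropped.
fpr, tpr, thr = roc_curve(y_true, y_score)

# Keep one point per distinct score, plus the starting point (0, 0).
fpr_all, tpr_all, thr_all = roc_curve(y_true, y_score, drop_intermediate=False)

print(len(fpr), len(fpr_all))
print(auc(fpr, tpr), auc(fpr_all, tpr_all))
```

Both calls describe the same curve and the same area. The short arrays are therefore not an error in themselves, but an AUC of exactly 1 or 0 on real data is worth double-checking against the labels and the direction of the score.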

keineahnung2345
  • Thanks a lot for your amazing answer, but what do you mean by pred distance: Euclidean distance or cosine similarity? I did what you told me, and I think the plot of the ROC curve is wrong because the AUC value is equal to 1.00 – Guizmo Charo Feb 21 '19 at 00:17
  • @GuizmoCharo By distance I mean a metric under which two samples are considered to belong to the same class if their distance is small, so it could be Euclidean distance or cosine `distance`. I've added the supposed `pred` and `y_test`; do yours look like that? – keineahnung2345 Feb 21 '19 at 00:35
  • Thanks again for your reply. I posted my code; I tried Euclidean distance and cosine similarity and it didn't work. Please check it. I know there is a small problem in my code but I can't find where. I'm really lost – Guizmo Charo Feb 21 '19 at 09:40
  • I was assuming your `verifyFace` returns the distance between pairs, but it looks like it returns cosine similarity. To use it with my answer, you should remove the `1-` from `pred = np.array([1-e/max_dist for e in pred])`. – keineahnung2345 Feb 22 '19 at 00:23
  • verifyFace returns two values: the first is the cosine similarity, the second is the label (1 if the two faces are the same, else 0). So I'm on the right track. If I remove it I get an AUC equal to 0, so I have to set pos_label=0 to get an AUC equal to 1. I'm just confused about why my AUC is only ever equal to 1 or 0. Or is my result correct for every test I did? – Guizmo Charo Feb 22 '19 at 11:43
  • I tested my two vectors to get the precision-recall and I get an average precision-recall score of 0.04! – Guizmo Charo Feb 22 '19 at 12:37