What is the prediction array in ROC curves in scikit-learn?

Question

import numpy as np
from sklearn import metrics
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)

I am doing link prediction with an algorithm, and I have a training network and a test network. For a given node I have the vector [1,0,1,0,0], which means the algorithm correctly predicted the first and third links and failed on the others. I want to measure the algorithm's performance with a ROC curve using scikit-learn. From the tutorial I understand the y array, which corresponds to my vector, but what is the scores array in the tutorial?

Assuming this is a simple binary problem with outputs 1 and 0, the scores will be the predicted probability of the positive class (1) from the scikit-learn estimator. Look for `predict_proba()` or `decision_function()` in the estimator you are trying to use. For a more specific answer, please add the necessary details about the data and the classifier to your question. — Vivek Kumar, Dec 01 '17 at 05:32
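To illustrate the comment above, here is a minimal sketch of getting scores from an estimator. The toy data and the choice of `LogisticRegression` are assumptions made only for illustration and are not from the question. Scoring the training data is also just for brevity; in practice you would score held-out data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

# Toy data, invented for this sketch: one feature, binary labels 0/1.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 0, 1, 1])

clf = LogisticRegression().fit(X, y)

# predict_proba returns one column per class, ordered as in clf.classes_,
# so look up the column that belongs to the positive class (1).
pos_col = list(clf.classes_).index(1)
proba_scores = clf.predict_proba(X)[:, pos_col]

# decision_function returns a signed margin; it ranks samples the same way.
margin_scores = clf.decision_function(X)

# Either array can be passed as the second argument of roc_curve.
fpr, tpr, thresholds = roc_curve(y, proba_scores, pos_label=1)
auc_proba = roc_auc_score(y, proba_scores)
auc_margin = roc_auc_score(y, margin_scores)
```

Because the ROC curve depends only on how the samples are ranked, the probabilities and the decision-function margins should give the same AUC here.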

score 0 · Answer 1 · answered Dec 01 '17 at 20:44

From my understanding, you want to evaluate link prediction using ROC curves. That means testing your link prediction algorithm the way you would test an ML classification algorithm. So the score array in scikit-learn would be the scores returned by the link prediction algorithm for the new links it predicts from the training network, which are then compared against the test network.
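A minimal sketch of that setup, assuming five candidate links for one node. The `y_true` vector reuses the [1,0,1,0,0] pattern from the question, read here as "link exists in the test network". The values in `link_scores` are made up for illustration; in practice they would come from whatever link prediction algorithm you use.

```python
import numpy as np
from sklearn import metrics

# 1 if the candidate link is present in the test network, else 0.
y_true = np.array([1, 0, 1, 0, 0])

# Hypothetical scores from the link prediction algorithm for the same
# five candidate links (higher means "more likely to be a real link").
link_scores = np.array([0.9, 0.3, 0.6, 0.7, 0.1])

# roc_curve sweeps a threshold over the scores and reports the false
# positive rate and true positive rate at each threshold.
fpr, tpr, thresholds = metrics.roc_curve(y_true, link_scores, pos_label=1)

# Area under the ROC curve, computed from the curve points.
roc_auc = metrics.auc(fpr, tpr)
print(roc_auc)
```

The point of the sketch is that `y_true` holds ground truth from the test network, while the second argument holds the algorithm's continuous scores rather than its hit-or-miss outcomes.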
