I want to use roc_auc_score to evaluate the performance of the classifier, but I'm not sure which parameters are the right ones to give it. This is the description of the function in the documentation: documentation. As you can see, it needs y_score, which is the probability estimate of the positive class, but how do I determine which class is positive? For example, when I use predict_proba, which column should I use?
Right now I use the function as follows:
clf = SVC(kernel='linear', probability=True, random_state=1)
clf.fit(train, train_Labels)
score = clf.predict_proba(test_values)  # predict_proba already returns a NumPy array
auc = roc_auc_score(test_Labels, score[:, 1])
train_Labels and test_Labels are one-dimensional vectors with all the 0s first and the 1s after, like [0,0,0,1,1,1]. In train and test, each row represents a sample and each column represents a feature.
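To make the setup reproducible, here is a minimal, self-contained version of the code above with synthetic data in place of my real arrays (the data here is just an illustrative stand-in). I also print clf.classes_, which, as far as I understand, gives the label order of predict_proba's columns, so the column for class 1 could be looked up explicitly instead of hard-coding index 1:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

# Synthetic stand-ins for my real data: 20 samples, 2 features,
# labels with all the 0s first and the 1s after, as described above.
rng = np.random.RandomState(0)
train = np.vstack([rng.randn(10, 2), rng.randn(10, 2) + 2.0])
train_Labels = np.array([0] * 10 + [1] * 10)
test_values = train        # reusing the training data only for illustration
test_Labels = train_Labels

clf = SVC(kernel='linear', probability=True, random_state=1)
clf.fit(train, train_Labels)

# classes_ lists the labels in the order of predict_proba's columns.
print(clf.classes_)        # -> [0 1]
pos_col = list(clf.classes_).index(1)

score = clf.predict_proba(test_values)
auc = roc_auc_score(test_Labels, score[:, pos_col])
print(auc)
```

With this lookup, score[:, pos_col] should be the probability of class 1 regardless of how the columns happen to be ordered, if I am reading classes_ correctly.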
It might not be appropriate to use predict_proba, but my project has special requirements, so don't worry about that. I want to know whether the vectors I am passing to the roc_auc_score function as y_true and y_score (with score[:, 1] as the positive-class probability) are correct.
If anything about the question is unclear, please ask; I'm a novice, so please bear with me.