I am trying to do binary classification. Because my dataset is small (275 samples), I am using leave-one-out cross-validation, and I want to get the average classification report and AUROC/AUPRC across all folds.
I have closely followed this link to arrive at my results, but I cannot understand what the code is doing in the last line.
for i in classifiers:
    print(i)
    originalclass = []
    predictedclass = []
    model = i
    loo = LeaveOneOut()
    print('Scores before feature selection')
    scores = cross_val_score(model, subset, y, cv=loo, scoring=make_scorer(classification_report_with_accuracy_score))
    print("CV score", np.mean(cross_val_score(model, subset, y, cv=loo, scoring='roc_auc')))
    print(classification_report(originalclass, predictedclass))
    print('Scores after feature selection')
    X_reduced = feature_reduction_using_RFECV(model, subset, y)
    scores = cross_val_score(model, X_reduced, y, cv=loo, scoring=make_scorer(classification_report_with_accuracy_score))
    print("CV score", np.mean(cross_val_score(model, X_reduced, y, cv=loo, scoring='roc_auc')))
    print(classification_report(originalclass, predictedclass))
Where exactly is the averaging happening in the above code? I calculate the mean CV score and print it, but the line after that confuses me the most. I initialize the originalclass and predictedclass variables at the beginning of the loop, but where are they used before being printed in the last line?
print(classification_report(originalclass, predictedclass))
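For context, the helper classification_report_with_accuracy_score is not shown above. The following is only a minimal sketch of what such a scorer might look like, assuming it follows the common pattern of appending each fold's true and predicted labels to module-level lists and returning the accuracy. The function body, the demo data (X_demo, y_demo) and the LogisticRegression model are assumptions for illustration, not the code from the link.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, make_scorer
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Module-level lists that the scorer fills as a side effect (assumed pattern).
originalclass = []
predictedclass = []

def classification_report_with_accuracy_score(y_true, y_pred):
    # Called once per fold by cross_val_score: store this fold's labels ...
    originalclass.extend(y_true)
    predictedclass.extend(y_pred)
    # ... and return an ordinary per-fold score.
    return accuracy_score(y_true, y_pred)

# Hypothetical stand-in data; the real `subset` and `y` are not shown.
X_demo, y_demo = make_classification(n_samples=40, n_features=5, random_state=0)

scores = cross_val_score(
    LogisticRegression(max_iter=1000),
    X_demo,
    y_demo,
    cv=LeaveOneOut(),
    scoring=make_scorer(classification_report_with_accuracy_score),
)

# After the call, the lists hold one entry per left-out sample, so the
# report is computed over all folds pooled together.
print(classification_report(originalclass, predictedclass))
```

If the helper really works like this, the final report would be a pooled report over all held-out predictions rather than an average of per-fold reports, and the lists would be filled inside cross_val_score rather than anywhere visible in the loop.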
Edited code
for i in classifiers:
    print(i)
    originalclass = y
    model = i
    loo = LeaveOneOut()
    print('Scores before feature selection')
    # One out-of-fold prediction per sample
    y_pred = cross_val_predict(model, subset, y, cv=loo)
    print("CV score", np.mean(cross_val_score(model, subset, y, cv=loo, scoring='roc_auc')))
    print(classification_report(originalclass, y_pred))
    print('Scores after feature selection')
    X_reduced = feature_reduction_using_RFECV(model, subset, y)
    y_pred = cross_val_predict(model, X_reduced, y, cv=loo)
    print("CV score", np.mean(cross_val_score(model, X_reduced, y, cv=loo, scoring='roc_auc')))
    print(classification_report(originalclass, y_pred))
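One thing worth noting about the 'roc_auc' lines: with leave-one-out, every test fold contains a single sample, so a per-fold ROC AUC is not defined. A possible alternative, sketched below under assumptions, is to collect out-of-fold probabilities with cross_val_predict(method="predict_proba") and compute AUROC and AUPRC once over the pooled predictions. The demo data, the LogisticRegression model, the 0.5 threshold and the use of average_precision_score as the AUPRC estimate are all illustrative choices, not part of my original code.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    average_precision_score,
    classification_report,
    roc_auc_score,
)
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical stand-in data; replace with the real `subset` and `y`.
X_demo, y_demo = make_classification(n_samples=60, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)
loo = LeaveOneOut()

# One out-of-fold probability of the positive class per sample.
proba = cross_val_predict(model, X_demo, y_demo, cv=loo, method="predict_proba")[:, 1]

# Hard labels for the classification report (0.5 threshold is an assumption).
y_pred = (proba >= 0.5).astype(int)

# Metrics computed once over all pooled held-out predictions.
auroc = roc_auc_score(y_demo, proba)
auprc = average_precision_score(y_demo, proba)

print(classification_report(y_demo, y_pred))
print("pooled AUROC", auroc)
print("pooled AUPRC", auprc)
```

This requires a classifier that implements predict_proba; for models that only expose decision_function, the same idea should work with method="decision_function".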