I want to make predictions using k-fold cross-validation and then store all of the predictions in a file.

I can run the cross-validation and get the accuracy. This is how I did it:

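from numpy import mean, std
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

# 10 folds repeated 3 times gives 30 accuracy scores in total
cv1 = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
model = LogisticRegression()

scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv1, n_jobs=-1)
print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))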

However, I cannot find a function that gives me access to the actual predictions. In the end, I want an output containing each data point's id and its predicted label.

I tried to find a way to access the predictions using

cross_val_predict(model, X, y, cv=cv1, method='predict')

but this function does not work with RepeatedKFold cross-validation.
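
For reference, here is a minimal sketch of the kind of output I am after. As far as I understand, cross_val_predict expects every sample to land in exactly one test fold, whereas RepeatedKFold puts each sample in a test fold once per repetition, so the sketch loops over the splits by hand and keeps one prediction column per repetition. The synthetic data from make_classification, the `ids` array, the column names and the file name `cv_predictions.csv` are all placeholders I made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold

# Synthetic stand-in data (assumption); replace with the real X, y and ids.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
ids = np.arange(len(y))  # hypothetical row identifiers

n_splits, n_repeats = 10, 3
cv1 = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=1)
model = LogisticRegression()

# One column of out-of-fold predictions per repetition.
preds = np.empty((len(y), n_repeats), dtype=y.dtype)
for i, (train_idx, test_idx) in enumerate(cv1.split(X, y)):
    # RepeatedKFold yields all folds of repetition 0, then repetition 1, ...
    repeat = i // n_splits
    fitted = clone(model).fit(X[train_idx], y[train_idx])
    preds[test_idx, repeat] = fitted.predict(X[test_idx])

# Assemble id + predictions and write them to a file.
out = pd.DataFrame(preds, columns=[f"pred_repeat_{r}" for r in range(n_repeats)])
out.insert(0, "id", ids)
out.to_csv("cv_predictions.csv", index=False)
```

This gives three predicted labels per row, one for each repetition. If a single label per id is needed, the columns would still have to be combined somehow (for example by majority vote), which is a separate design choice.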

desertnaut
RToPython
  • What do you mean "does not work"? Do you get an error or what? Please do not answer here, edit & update your post accordingly. – desertnaut Aug 25 '21 at 10:28
  • With `RepeatedKFold(n_repeats=3, ...)`, you'll get three predicted values per row. Do you care whether the predictions are assigned to their repetition? Do you need to know the folds used for each repetition? – Ben Reiniger Aug 25 '21 at 14:29

0 Answers