A technique in cross-validation where the data is partitioned into k subsets (or "folds"); in each iteration, k-1 folds are used for training and the remaining fold for evaluation. The process is repeated k times, leaving out a different fold for evaluation each time.
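A minimal sketch of the procedure using scikit-learn's KFold; the dataset and classifier below are placeholders chosen for illustration:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, test_idx in kf.split(X):
    # train on k-1 folds, evaluate on the held-out fold
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

print(sum(scores) / len(scores))  # mean accuracy over the k folds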
Questions tagged [k-fold]
284 questions
1
vote
0 answers
How can I use k-fold with flow_from_directory and 2 ImageDataGenerators?
I am trying to make a Convolutional Neural Network that detects whether the eye of the user (captured via the computer's webcam live video) is opened or closed.
I used the MRL Eye Dataset of 84898 images (link: http://mrl.cs.vsb.cz/eyedataset), and…

Joseph Nasr
- 13
- 4
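A workaround often suggested for this situation (a sketch, not an accepted answer): flow_from_directory has no notion of fold indices, so list the files into a DataFrame and drive flow_from_dataframe with KFold splits. The directory layout, class names, and image size below are assumptions:

import pandas as pd
from pathlib import Path
from sklearn.model_selection import KFold
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# assumed layout: data/open/*.png and data/closed/*.png
files = [(str(p), p.parent.name) for p in Path("data").glob("*/*.png")]
df = pd.DataFrame(files, columns=["filename", "class"])

train_gen = ImageDataGenerator(rescale=1/255., horizontal_flip=True)
val_gen = ImageDataGenerator(rescale=1/255.)

for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(df):
    train_flow = train_gen.flow_from_dataframe(df.iloc[train_idx], target_size=(64, 64), class_mode="binary")
    val_flow = val_gen.flow_from_dataframe(df.iloc[val_idx], target_size=(64, 64), class_mode="binary")
    # build a fresh model here, then model.fit(train_flow, validation_data=val_flow)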
1
vote
0 answers
StratifiedKFold overfitting
I'm working on a multimodal classifier (text + image) using PyTorch (only 2 classes).
Since I don't have a lot of data, I've decided to use StratifiedKFold to avoid overfitting.
I noticed a strange behavior on training/testing curves.
My training…

Sylvain Lejamble
- 62
- 10
1
vote
1 answer
Where should I define sklearn model in a kfold validation setup?
I am a novice in Machine Learning and I have confusion about K-fold cross-validation. When I write a fold for-loop, where exactly should I define the sklearn model (not the PyTorch model)? I have seen some tutorials where they define the model inside the…

DINABANDHU BEHERA
- 11
- 1
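The usual advice, sketched below (not quoted from the accepted answer): create a fresh estimator inside the loop so no fold inherits state fitted on another fold; sklearn.base.clone produces an unfitted copy from a configured template.

from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
template = SVC(kernel="rbf", C=1.0)  # configured once, never fitted directly

for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = clone(template)  # fresh, unfitted copy for each fold
    model.fit(X[train_idx], y[train_idx])
    print(model.score(X[test_idx], y[test_idx]))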
1
vote
0 answers
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead. python
I run the following code:
seed = 7
np.random.seed(seed)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
for train, test in kfold.split(X, Y):
    model = Sequential()
    model.add(Dense(64, input_dim=12, activation='relu'))
    …

Alexandros
- 71
- 1
- 7
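The error message itself points at the cause: StratifiedKFold stratifies on class labels, so it rejects a continuous Y. A sketch of the distinction, assuming a regression-style target:

import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.random.rand(100, 12)
y_continuous = np.random.rand(100)             # regression-style target
y_classes = (y_continuous > 0.5).astype(int)   # discrete labels

# StratifiedKFold requires discrete labels:
for train, test in StratifiedKFold(n_splits=5, shuffle=True, random_state=7).split(X, y_classes):
    pass  # ok

# for a continuous target, plain KFold works (it ignores y):
for train, test in KFold(n_splits=5, shuffle=True, random_state=7).split(X):
    pass  # ok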
1
vote
1 answer
What is the difference between a "normal" k-fold cross-validation using shuffle=True and a repeated k-fold cross-validation?
Could anyone explain the difference between a "normal" k-fold cross-validation using the shuffle option, e.g.
kf = KFold(n_splits = 5, shuffle = True)
and a repeated k-fold cross-validation? Shouldn't they return the same results?
Having a hard…

JKnow
- 21
- 5
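A sketch of the distinction: a single shuffled KFold still tests each sample exactly once, while RepeatedKFold reshuffles and re-splits n_repeats times, so the two generally do not return the same results:

import numpy as np
from sklearn.model_selection import KFold, RepeatedKFold

X = np.arange(20).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
print(sum(1 for _ in kf.split(X)))   # 5 splits: one pass over the data

rkf = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
print(sum(1 for _ in rkf.split(X)))  # 15 splits: three independent 5-fold passes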
1
vote
2 answers
Random forest with k-fold cross-validation using the caret package in R: best AUC
I have a fairly serious problem that I haven't been able to solve for many days!
I cannot understand exactly how the trainControl function of the caret package works in R.
I need to cross-validate (10-fold) a random forest and thought that the caret…

Jresearcher
- 297
- 3
- 13
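Not an answer about caret internals, but the equivalent 10-fold AUC setup in scikit-learn may help clarify what trainControl is expected to do (the dataset and hyperparameters here are placeholders):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
aucs = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv, scoring="roc_auc")
print(aucs.mean())  # mean AUC over the 10 folds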
1
vote
1 answer
How to plot k-fold cross validation in R
I have a model similar to the following, and I am wondering: is there a beautiful and effective way to plot the folds to show the stability and performance of my model?
data(iris)
df=iris[,1:4]
con = trainControl(method="cv",…

tibi
- 69
- 7
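One simple approach, sketched in Python rather than R and with a placeholder model: compute the per-fold scores, then plot them against the fold index alongside their mean:

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)

plt.plot(range(1, 11), scores, marker="o", label="fold accuracy")
plt.axhline(scores.mean(), linestyle="--", label=f"mean = {scores.mean():.3f}")
plt.xlabel("fold")
plt.ylabel("accuracy")
plt.legend()
plt.show()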
1
vote
0 answers
Why are NaN values found in the score from kfoldPredict?
Names = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'};
isCategoricalPredictor = [false, false, true, false, true, false, false, false];
% Use tree learner
template = templateTree('NumVariablesToSample', 'all',... % to analyse predictor…

hunterex
- 565
- 11
- 27
1
vote
1 answer
Different sample sizes in kfold between PyCharm and Spyder
I'm trying to classify texts into categories. I've developed the code which does this, but the kfold sample sizes differ between Spyder and PyCharm, even though the code is exactly the same.
This is the code:
def baseline_model():
    model = Sequential()
    …

iso_9001_
- 2,655
- 6
- 31
- 47
1
vote
0 answers
The size of samples in each k-fold on GPU is different from the size of samples in each k-fold on CPU
I am running the same code training the same CNN model using the same dataset on GPU and CPU, and I am using k-fold cross validation in my code. The problem is that k-fold seems not to be working properly on GPU, because on CPU the number of samples that…

AFHG
- 11
- 1
1
vote
2 answers
(Stratified) KFold vs. train_test_split - What training data is used?
I am just a beginner in ML and am trying to understand what exactly is the advantage of (Stratified) KFold over the classic train_test_split.
The classic train_test_split uses exactly one part for training (in this case 75%) and one part for testing (in…

4ndy94
- 31
- 1
- 3
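A sketch of the contrast (placeholder dataset and model): train_test_split scores the model on one fixed holdout, while k-fold rotates the holdout so every sample is used for both training and testing across iterations:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# one split: a single score that depends on which 25% landed in the test set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
print(model.fit(X_tr, y_tr).score(X_te, y_te))

# 4-fold stratified CV: four scores, every sample tested exactly once
print(cross_val_score(model, X, y, cv=StratifiedKFold(n_splits=4)))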
1
vote
2 answers
Cross validation and ROC curve using Matlab: how to plot the mean ROC curve?
I am using k-fold cross validation with k = 10. Thus, I have 10 ROC curves.
I would like to average over the curves. I can't just average the values on the Y axes (using perfcurve) because the vectors returned are not the same…

Antonio Mendes
- 103
- 7
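The usual fix, regardless of language, is to interpolate each fold's ROC onto a common FPR grid before averaging; a sketch in Python (the question uses MATLAB's perfcurve, but the idea carries over):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=300, random_state=0)
mean_fpr = np.linspace(0, 1, 100)  # common grid shared by all folds
tprs = []

for tr, te in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    probs = LogisticRegression(max_iter=1000).fit(X[tr], y[tr]).predict_proba(X[te])[:, 1]
    fpr, tpr, _ = roc_curve(y[te], probs)
    tprs.append(np.interp(mean_fpr, fpr, tpr))  # resample onto the grid

mean_tpr = np.mean(tprs, axis=0)  # pointwise average: the mean ROC curve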
1
vote
1 answer
Where to create model object in keras in K-Fold Cross validation?
Where should I create the Keras model object: inside the K-fold loop, or outside?
Please explain why your answer is correct.
def model_def():
    model = Sequential()
    model.add(.... so on....)
    model.compile(....so on ....)
    return model
Case 1:-…

Bhuvan S
- 213
- 1
- 4
- 10
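The commonly given answer, sketched with a self-contained stand-in for the question's model_def: call the builder inside the loop, otherwise fold i+1 starts from the weights fold i finished with and the fold scores are no longer independent.

import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras import layers, models

def model_def():
    model = models.Sequential([
        layers.Input(shape=(4,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

X = np.random.rand(100, 4)
y = np.random.randint(0, 2, 100)

for train_idx, val_idx in KFold(n_splits=5, shuffle=True).split(X):
    model = model_def()  # called inside the loop: fresh random weights per fold
    model.fit(X[train_idx], y[train_idx],
              validation_data=(X[val_idx], y[val_idx]), epochs=5, verbose=0)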
1
vote
3 answers
Why do we need to recreate the model every time?
Here I have this piece of Python code, taken from SoloLearn:
scores = []
kf = KFold(n_splits=5, shuffle=True)
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index],…

Matte
- 73
- 6
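One way to see why (a sketch, not from the listed answers): with a stateful estimator such as SGDClassifier(warm_start=True), reusing a single object lets each fit continue from coefficients learned on earlier folds, whose training data included the current test fold:

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import KFold

X = np.random.rand(200, 5)
y = np.random.randint(0, 2, 200)

shared = SGDClassifier(warm_start=True, random_state=0)
for train_index, test_index in KFold(n_splits=5, shuffle=True).split(X):
    # warm_start=True makes fit() resume from the previous coefficients,
    # which were trained on folds containing this iteration's test samples
    shared.fit(X[train_index], y[train_index])
    print(shared.score(X[test_index], y[test_index]))  # optimistically biased

# recreating the model inside the loop avoids this leakage entirely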
1
vote
1 answer
How to split data into test and train after applying stratified k-fold cross validation?
I have already assigned each row to its specific k-fold using the following code:
from sklearn.model_selection import StratifiedKFold, train_test_split
# Stratified K-fold cross-validation
df['kfold'] = -1
df =…

yudhiesh
- 6,383
- 3
- 16
- 49
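Given a kfold column built this way, the train/test split is just a boolean filter per fold; a sketch assuming the column name from the excerpt (the DataFrame below is a placeholder):

import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

df = pd.DataFrame({"feature": np.random.rand(100),
                   "target": np.random.randint(0, 2, 100)})

df["kfold"] = -1
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (_, val_idx) in enumerate(skf.split(df, df["target"])):
    df.loc[val_idx, "kfold"] = fold

# train/test for a given fold: hold out the rows tagged with that fold
fold = 0
train_df = df[df["kfold"] != fold].reset_index(drop=True)
test_df = df[df["kfold"] == fold].reset_index(drop=True)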