A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each round, k-1 folds are used for training and the remaining fold for evaluation. The process is repeated k times, holding out a different fold for evaluation each time.
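As a rough illustration of the partitioning described above, here is a minimal pure-Python sketch. The helper name `k_fold_indices` is hypothetical and not taken from any library. It assumes contiguous, unshuffled folds; library implementations typically also offer shuffling and stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Hypothetical helper for illustration only; no shuffling is applied.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        # The current fold is held out for evaluation...
        test_idx = list(range(start, stop))
        # ...and the other k-1 folds together form the training set.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


splits = list(k_fold_indices(10, 3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in the evaluation fold exactly once across the k rounds, which is what distinguishes k-fold from a single train/test split.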
Questions tagged [k-fold]
284 questions
1 vote
0 answers
Stratified KFold Cross Validation (Keras) ValueError: Found array with dim 4. Estimator expected <= 2
I need to cross-validate a Keras model using stratified k-fold (an imbalanced multiclass task). Is it possible to use x_train/y_train with ImageDataGenerator (flow_from_directory) in (folds = list(StratifiedKFold(k, shuffle=True,…

LM-azmp
1 vote
1 answer
Is cross validation used for model selection?
So this is starting to confuse me a bit. Having for example the following code that trains a GLM model:
glm_sens = train(
form = target ~ .,
data = ABT,
trControl = trainControl(method = "repeatedcv", number = 5, repeats = 10, classProbs =…

Piet Hein
1 vote
1 answer
K-Fold Cross Validation on entire Dataset
I would like to know whether my current procedure is correct or whether I might have data leakage.
After importing the dataset, I split it with an 80/20 ratio.
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.20, random_state=0,…

Diogo Silva
1 vote
1 answer
Probabilities from cross_val_predict using RepeatedStratifiedKFold 5*10
My goal is to calculate the AUC, specificity, and sensitivity with a 95% CI from a 5*10 StratifiedKFold CV. I also need the specificity and sensitivity for a threshold of 0.4 to maximize the sensitivity.
So far I was able to implement it for the AUC.…

Mischa
1 vote
0 answers
Naive Bayes NLTK Cross Validation
I have a problem with understanding how KFold Cross-Validation works in the new model selection version. I am using Naive Bayes classifier and I would like to test it using cross-validation. My test and train data are split like this:
test_set =…

Simm
1 vote
0 answers
How to use k-fold cross-validation with the 'patternnet' neural network in Matlab?
I'm trying to use k-fold cross-validation with the patternnet neural network.
inputs1 is a feature vector and targets1 is label vector from 'iris_dataset'. And xtrain, xtest, ytrain, and ytest are training & testing features and labels respectively…

Ellie
1 vote
0 answers
Error using R caret package (train) with C5.0 decision tree to do K-fold cross validation
NOW SOLVED.
The problem was data=OneT.train, which was wrong. This code was copied over from the original. It needs to be data=OneT in the caret train() function. The current OneT.train had missing values in an attribute field, not the target, from…
user13248694
1 vote
0 answers
What to do after Stratified K-fold?
I have used the StratifiedKFold to cross validate my training data set. The model has achieved an accuracy of 75% which I have found acceptable.
Should I just continue and apply my model to the test set:
model.fit(X_train, y_train)
y_pred =…

janqwerty
1 vote
0 answers
xgb.cv's auc score is not matching with cross_val_score when `colsample_bytree` is other than 1
I am working on a highly imbalanced dataset. During hyperparameter tuning, I found that if colsample_bytree is set to a value other than 1, the cross_val_score result from the sklearn package does not match the AUC score obtained from xgb.cv.
xgb.cv…

Ravi Prasad
1 vote
1 answer
LeavePGroupsOut For multidimensional array
I am working on a research problem and, because my dataset of subjects is small, I am trying to implement leave-N-out style analyses.
Currently I am doing this ad hoc, and I stumbled upon scikit-learn's LeavePGroupsOut.
I read the docs but I…

konsalex
1 vote
0 answers
RandomForestRegressor - K-fold CV cross_val_predict never complete
I'm using RandomForestRegressor to generate new features:
The old script took 20 minutes but still completed...
param_grid = {
'n_estimators': [10, 50, 100, 1000],
'max_depth' : [4,5,6,7,8],
}
def rfr_model(X, Y):
--Perform…

Katereena
1 vote
1 answer
How to Retain The Evaluation Score of kfold using cross_val_score()
I want to understand kfold more clearly and how to choose the best model after it is implemented as a cross-validation method.
According to this source: https://machinelearningmastery.com/k-fold-cross-validation/ the steps to carry out kfold…

Eric Avila Torres
1 vote
1 answer
How to do kfold cross-validation for multi-input models
The model is as below:
inputs_1 = keras.Input(shape=(10081,1))
layer1 = Conv1D(64,14)(inputs_1)
layer2 = layers.MaxPool1D(5)(layer1)
layer3 = Conv1D(64, 14)(layer2)
layer4 = layers.GlobalMaxPooling1D()(layer3)
inputs_2 = keras.Input(shape=(85,)) …

nilsinelabore
1 vote
1 answer
How to apply Kfold with TfidfVectorizer?
I'm having an issue applying K-fold cross-validation with Tfidf. It gives me this error:
ValueError: setting an array element with a sequence.
I have seen other questions about the same problem, but they were using train_test_split(). It's a little…

Dia Abujaber
1 vote
1 answer
Is it possible to get back the list in stratifiedKFold?
I'd like to do something like this:
Skf = sklearn.model_selection.StratifiedKFold(n_splits = 5, shuffle = True)
ALPHA,BETA = Skf.split(data_X, data_Y)
and then:
for train_index, test_index in ALPHA,BETA
However, it isn't working, why and how…

Marine Galantin