A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each iteration, k-1 folds are used for training and the remaining fold is held out for evaluation. The process is repeated k times, holding out a different fold for evaluation each time, so every sample is used for evaluation exactly once.
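As a rough illustration of the definition above, here is a minimal sketch in plain Python that generates the train/evaluation index splits. The function name `k_fold_indices` and the choice to split contiguously without shuffling are assumptions made for this example, not part of any particular library.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs, one pair per fold.

    Hypothetical helper for illustration: folds are contiguous and unshuffled.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # When n_samples is not divisible by k, the first `extra` folds
    # each receive one additional sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]                 # the held-out fold
        train = indices[:start] + indices[stop:]   # the other k-1 folds
        yield train, test
        start = stop


splits = list(k_fold_indices(n_samples=10, k=5))
for fold, (train, test) in enumerate(splits):
    print(f"fold {fold}: train={train} test={test}")
```

In practice the data is usually shuffled (or stratified by class) before splitting; this sketch omits that step to keep the fold mechanics visible.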
Questions tagged [k-fold]
284 questions
1
vote
0 answers
ValueError: Unexpected result of `train_function` (Empty logs). Regression with K-Fold model
I am trying to run this code from François Chollet's book "Deep Learning with Python" (chapter 3, regression part).
from keras.datasets import boston_housing
(train_data, train_targets), (test_data, test_targets) =…

nbrc
- 31
- 6
1
vote
1 answer
Applying k-folds (stratified 10-fold) to my text classification model
I need help with the following: I have a data frame with columns Class (0,1) and text.
After cleansing (lemmatizing, removing stopwords, etc.), I split the data as follows:
# splitting dataset
from sklearn.model_selection import…

Firas K.
- 11
- 2
1
vote
1 answer
Where should the categorical encoding be done in a k-fold CV procedure?
I want to apply a cross validation method in my machine learning models. In these models, I want a Feature Selection and a GridSearch to be applied as well. Imagine that I want to estimate the performance of K-Nearest-Neighbor Classifier by applying…

asm04
- 11
- 3
1
vote
0 answers
ValueError: Shape of passed values is (X,y), indices imply (X,y) when saving K-Fold 'test data' into dataframe
I'm trying to save all iterations of my KFold 'test data, class, and predicted result' in one dataframe, but it returns ValueError: Shape of passed values is (1534, 3), indices imply (1, 3). How can I fix this?
My code :
for train_index, test_index…

Abbi KRK
- 53
- 10
1
vote
2 answers
How to compute false positive rate of an imbalanced dataset for Stratified K fold cross validation?
The below lines are the sample code where I am able to compute accuracy, precision, recall, and f1 score. How can I also compute a false positive rate (FPR) for Stratified K fold cross-validation?
from sklearn.metrics import make_scorer,…

Arun Kumar Dey
- 47
- 1
- 6
1
vote
0 answers
lightgbm.basic.LightGBMError: Sum of query counts is not same with #data
I was trying to do hyper-parameter tuning using GroupKFold and RandomSearchCV. I have cross-checked the shapes, and they match. How can I solve this error?
Code:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.20,…

Python coder
- 743
- 5
- 18
1
vote
0 answers
Access predictions from repeated k-fold cross validation
I want to do a prediction using k-fold cross validation and, in the end, store all the predictions in a file.
I am able to make a prediction and get the accuracy; this is how I did it:
cv1 = RepeatedKFold(n_splits=10, n_repeats=3,…

RToPython
- 11
- 1
1
vote
1 answer
Split k-fold where each fold of validation data doesn't include duplicates
Let's say I have a pandas dataframe df. The df contains 1,000 rows, like below.
print(df)
id class
0 0000799a2b2c42d 0
1 00042890562ff68 0
2 0005364cdcb8e5b 0
3 0007a5a46901c56 0
4 …

Dream Aerwyna
- 23
- 4
1
vote
0 answers
Incorrect signal predictions for Sequential Model
I am running a signal processing experiment using Python and scikit-learn. I have an elaborate question which spans multiple steps during K-Fold validation and the subsequent predictions. I describe them below.
I have 8640 samples of signal…

Swati Shah
- 13
- 5
1
vote
1 answer
GridSearchCV first and then k fold CV or the other way round?
I am having a lot of confusion between GridSearchCV and K fold Cross Validation. I know that GridSearch is only for hyperparameter optimization and K Fold will split my data into K folds and iterate over them (cv value). So should I first split my…

spectre
- 717
- 7
- 21
1
vote
1 answer
Use logistic regression on data set with repeated K fold using R
I am trying to predict whether water is safe to drink. The data set is the one here:
https://www.kaggle.com/adityakadiwal/water-potability?select=water_potability.csv.
Assume I take the dataframe to be composed of Ph, Hardness,…

iftach s
- 25
- 5
1
vote
1 answer
How to display all 4 splits in an array for KFold at n=4?
Each tuple in this list should consist of a train_indices list and a test_indices list containing the training/testing data point indices for that particular K-th split.
Below is what we want to achieve with the dataset:
data_indices =…

Back Buddy
- 35
- 7
1
vote
1 answer
How to plot the data and model fit for each fold after kfold cross validation?
I am trying to predict one label variable based on one feature. The two seem to be highly linearly correlated. I chose a linear regression model to describe the data. The output of my code shows the R2 score for the training and testing data. My model…

the phoenix
- 641
- 7
- 15
1
vote
0 answers
AttributeError: 'Series' object has no attribute 'lower'?
"this is a code, folds are created but problem is with fit function"
data =…

Zohaib Arshid
- 23
- 4
1
vote
2 answers
KFold cross validation in Tensorflow
I am trying to implement KFold validation for a neural network using the sklearn and TensorFlow packages.
My code looks like this:
def training(self):
    n_split = 3
    instances = self.instance
    labels = self.labels
    for train_index, test_index in…

Raj Rajeshwari Prasad
- 304
- 2
- 17