Questions tagged [k-fold]

A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each round, k-1 folds are used for training and the one remaining fold is held out for evaluation. The process is repeated k times, holding out a different fold for evaluation each time, so every fold serves as the evaluation set exactly once.
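
The partitioning described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not how any particular library implements it: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled, and any remainder is assumed to be spread over the first folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Hypothetical helper for illustration only; no shuffling is performed.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        # The current fold is held out for evaluation.
        test = list(range(start, stop))
        # The other k-1 folds together form the training set.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


splits = list(k_fold_indices(10, 4))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each index appears in exactly one evaluation fold and in the training set of the other k-1 rounds. In practice, most questions under this tag rely on a library splitter (for example scikit-learn's `KFold`) rather than hand-written index logic.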

284 questions
1
vote
0 answers

ValueError: Unexpected result of `train_function` (Empty logs). Regression with K-Fold model

I am trying to run this code from François Chollet's book "Deep Learning with Python" (chapter 3, regression part). from keras.datasets import boston_housing (train_data, train_targets), (test_data, test_targets) =…
nbrc
  • 31
  • 6
1
vote
1 answer

Applying k-folds (stratified 10-fold) to my text classification model

I need help with the following. I have a data frame with the columns Class (0, 1) and text. After cleaning (lemmatizing, removing stopwords, etc.), I split the data as follows: #splitting datset from sklearn.model_selection import…
1
vote
1 answer

Where the categorical encoding should be done in a k fold - cv procedure?

I want to apply a cross validation method in my machine learning models. In these models, I want a Feature Selection and a GridSearch to be applied as well. Imagine that I want to estimate the performance of a K-Nearest-Neighbor Classifier by applying…
1
vote
0 answers

ValueError: Shape of passed values is (X,y), indices imply (X,y) when saving K-Fold 'test data' into dataframe

I'm trying to save every iteration of my KFold 'test data, class, and predicted result' in one dataframe, but it returns ValueError: Shape of passed values is (1534, 3), indices imply (1, 3). How can I fix this? My code: for train_index, test_index…
Abbi KRK
  • 53
  • 10
1
vote
2 answers

How to compute false positive rate of an imbalanced dataset for Stratified K fold cross validation?

The lines below are sample code in which I am able to compute accuracy, precision, recall, and F1 score. How can I also compute the false positive rate (FPR) for Stratified K fold cross-validation? from sklearn.metrics import make_scorer,…
1
vote
0 answers

lightgbm.basic.LightGBMError: Sum of query counts is not same with #data

I was trying to do hyper-parameter tuning using GroupKFold and RandomSearchCV. I have cross-checked the shapes, and they match. How can I solve this error? Code: X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.20,…
Python coder
  • 743
  • 5
  • 18
1
vote
0 answers

Access predictions from repeated k-fold cross validation

I want to make predictions using k-fold cross validation and, in the end, store all the predictions in a file. I am able to make predictions and get the accuracy; this is how I did it: cv1 = RepeatedKFold(n_splits=10, n_repeats=3,…
1
vote
1 answer

Split k-fold where each fold of validation data doesn't include duplicates

Let's say I have a pandas dataframe df. The df contains 1,000 rows. Like below. print(df) id class 0 0000799a2b2c42d 0 1 00042890562ff68 0 2 0005364cdcb8e5b 0 3 0007a5a46901c56 0 4 …
1
vote
0 answers

Incorrect signal predictions for Sequential Model

I am running a signal processing experiment using Python and scikit-learn. I have an elaborate question that spans multiple steps of K-Fold validation and the subsequent predictions. I describe them below. I have 8640 samples of signal…
Swati Shah
  • 13
  • 5
1
vote
1 answer

GridSearchCV first and then k fold CV or the other way round?

I am having a lot of confusion between GridSearchCV and K fold Cross Validation. I know that GridSearch is only for hyperparameter optimization and K Fold will split my data into K folds and iterate over them (cv value). So should I first split my…
spectre
  • 717
  • 7
  • 21
1
vote
1 answer

Use logistic regression on data set with repeated K fold using R

I am trying to predict whether water is safe to drink or not. The data set is the one here: https://www.kaggle.com/adityakadiwal/water-potability?select=water_potability.csv. Assume I take the dataframe to be composed of Ph, Hardness,…
iftach s
  • 25
  • 5
1
vote
1 answer

How to display all 4 splits in a array for Kfolds at n=4?

Each tuple in this list should consist of a train_indices list and a test_indices list containing the training/testing data point indices for that particular K th split. Below is what we want to achieve with the dataset: data_indices =…
Back Buddy
  • 35
  • 7
1
vote
1 answer

How to plot the data and model fit for each fold after kfold cross validation?

I am trying to predict one label variable based on one feature. The two seem to be highly linearly correlated. I chose a linear regression model to describe the data. The output of my code shows the R2 score for the training and testing data. My model…
1
vote
0 answers

AttributeError: 'Series' object has no attribute 'lower'?

This is the code. The folds are created, but the problem is with the fit function. data =…
1
vote
2 answers

KFold cross validation in Tensorflow

I am trying to implement KFold validation for a neural network using the sklearn and TensorFlow packages. My code looks like this: def training(self): n_split = 3 instances = self.instance labels = self.labels for train_index, test_index in…