Questions tagged [k-fold]

A technique in cross-validation where the data is partitioned into k subsets (or "folds"); in each round, k-1 folds are used for training and the remaining fold for evaluation. The process is repeated k times, leaving out a different fold for evaluation each time.
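A minimal sketch of the procedure with scikit-learn's KFold; the dataset and model below are placeholders:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])        # train on k-1 folds
    scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))  # score the held-out fold

print(sum(scores) / len(scores))           # average score over the k folds
```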

284 questions
0 votes, 1 answer

How to create a k-fold cross validation test?

I have data from a pollution sensor that I wish to validate by comparing it against data from londonair.org.uk. I have created a simple linear regression model with my sensor data on the X-axis and the Londonair data on the Y-axis, and…
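A hedged sketch of how such a regression could be cross-validated, assuming the sensor and Londonair readings are two aligned arrays (the data below is a synthetic placeholder):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic placeholders; replace with the aligned sensor and Londonair series.
sensor = np.random.rand(100, 1)                             # X: sensor readings, shape (n, 1)
londonair = 2 * sensor[:, 0] + np.random.rand(100) * 0.1    # y: reference readings

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), sensor, londonair, cv=cv, scoring="r2")
print(scores.mean(), scores.std())
```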
0 votes, 1 answer

The right way to do cross-validation

Let's say I have fold1, fold2, fold3, and I train modelA on them. A) modelA(fold1) -> modelA(fold2) -> modelA(fold3) B) modelA(fold1) -> saved weights of modelA(fold1) -> modelA(fold2) -> saved weights of modelA(fold2) -> modelA(fold3) ->…
user11240811
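In textbook k-fold cross-validation each fold gets a freshly initialized model rather than the weights carried over from the previous fold; a minimal sketch of that convention (the model and data are placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, random_state=0)   # placeholder data

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
for train_idx, val_idx in skf.split(X, y):
    model = MLPClassifier(max_iter=1000, random_state=0)    # fresh weights for every fold
    model.fit(X[train_idx], y[train_idx])
    print(model.score(X[val_idx], y[val_idx]))
```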
0 votes, 0 answers

How to make sure that my dataset is equally distributed among the classes, i.e. stratified, so that fold size and class distribution are balanced?

I have written simple K-fold cross-validation code; now I want to modify it so that the folds are balanced in size and class distribution. P.S.: I need to write the Python code from scratch; sklearn is not allowed. from random import seed from random…
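One way to build stratified folds from scratch, as the question requires, is to group the indices by class, shuffle within each class, and deal them out round-robin; a sketch using only the standard library:

```python
from collections import defaultdict
from random import seed, shuffle

def stratified_kfold_indices(labels, k, random_state=1):
    """Assign each sample index to one of k folds so every fold keeps
    roughly the same class proportions as the full dataset."""
    seed(random_state)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        shuffle(indices)                   # shuffle within each class
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)       # deal the indices out round-robin
    return folds

labels = [0] * 60 + [1] * 40               # toy label list
for fold in stratified_kfold_indices(labels, k=5):
    print(len(fold), sum(labels[i] for i in fold))   # fold size and positives per fold
```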
-1 votes, 1 answer

How to identify outliers and drop rows in train splits of each fold when using StratifiedKFold in GridSearchCV?

For predicting whether a subject has liver disease or not, I'm using StratifiedKFold CV in GridSearch for AdaBoost and RandomForest classifiers. For outlier analysis, I've identified all feature outliers and extracted their row indices on the full…
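GridSearchCV cannot itself drop rows inside each training split, so one common workaround is a manual loop over StratifiedKFold splits in which outliers are removed from the training portion only; a sketch with a placeholder IQR rule and placeholder data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=300, random_state=0)   # placeholder data

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in skf.split(X, y):
    X_tr, y_tr = X[train_idx], y[train_idx]

    # Placeholder outlier rule: keep rows whose first feature lies within 1.5 IQR.
    q1, q3 = np.percentile(X_tr[:, 0], [25, 75])
    iqr = q3 - q1
    keep = (X_tr[:, 0] >= q1 - 1.5 * iqr) & (X_tr[:, 0] <= q3 + 1.5 * iqr)

    clf = RandomForestClassifier(random_state=0).fit(X_tr[keep], y_tr[keep])
    print(clf.score(X[val_idx], y[val_idx]))          # the validation fold is left untouched
```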
-1 votes, 2 answers

How to use StratifiedKFold with wandb sweeps?

I have the following piece of code - it is a train function for Logistic regression. I run sweeps to be able to compare hyperparameter tuning results. My issue is that I don't know how to incorporate StratifiedKFold to work with sweeps. I would…
Yana
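A rough sketch of one way to combine the two, assuming the sweep tunes a parameter named C and fills it into wandb.config; the data loading here is a placeholder:

```python
import numpy as np
import wandb
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

def train():
    run = wandb.init()                                           # the sweep agent populates wandb.config
    X, y = make_classification(n_samples=500, random_state=0)    # placeholder dataset

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = []
    for train_idx, val_idx in skf.split(X, y):
        clf = LogisticRegression(C=wandb.config.C, max_iter=1000)
        clf.fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[val_idx], y[val_idx]))

    wandb.log({"mean_val_accuracy": float(np.mean(scores))})     # one metric for the sweep to optimize
    run.finish()

# wandb.agent(sweep_id, function=train) would then call this once per sweep trial.
```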
-1 votes, 1 answer

Repeated holdout method

How can I implement the "repeated" holdout method? I have implemented the holdout method and get an accuracy, but I need to repeat it 30 times. Here is my code for the holdout method: [IN] X_train, X_test, Y_train, Y_test = train_test_split(X, Y.values.ravel(),…
raideR49
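A minimal sketch of repeated holdout: run train_test_split 30 times with a different random_state each time and average the scores. X and Y are assumed to be the frames from the question; a placeholder dataset is loaded so the snippet runs on its own.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder dataset; replace with the X, Y from the question.
X, Y = load_breast_cancer(return_X_y=True, as_frame=True)

accuracies = []
for i in range(30):                                   # 30 repetitions of the holdout split
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y.values.ravel(), test_size=0.3, random_state=i)   # a different split each time
    model = LogisticRegression(max_iter=5000).fit(X_train, Y_train)
    accuracies.append(accuracy_score(Y_test, model.predict(X_test)))

print(np.mean(accuracies), np.std(accuracies))        # average accuracy over the 30 repeats
```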
-1 votes, 1 answer

K-Fold cross validation for Lasso and Ridge models

I'm working with the Boston housing dataset from sklearn.datasets and have run ridge and lasso regressions on my data (post train/test split). I'm now trying to perform k-fold cross validation to find the optimal penalty parameters, and have written…
econ32
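Note that the Boston housing dataset was removed from scikit-learn in version 1.2, so the sketch below uses a synthetic placeholder; it lets RidgeCV and LassoCV pick the penalty by k-fold cross-validation over a grid of alphas:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=20, noise=15, random_state=0)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

alphas = np.logspace(-3, 3, 50)                              # candidate penalty strengths
ridge = RidgeCV(alphas=alphas, cv=5).fit(X_train, y_train)   # 5-fold CV over the alphas
lasso = LassoCV(alphas=alphas, cv=5, random_state=0).fit(X_train, y_train)

print("ridge alpha:", ridge.alpha_, "test R2:", ridge.score(X_test, y_test))
print("lasso alpha:", lasso.alpha_, "test R2:", lasso.score(X_test, y_test))
```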
-1 votes, 1 answer

How can I get data after cross-validation?

I'm trying to make an image classifier for 7 classes using transfer learning with Xception, and now I'm trying to implement cross-validation. I know KFold returns indices, but how can I get the data values? from sklearn.model_selection import KFold import…
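KFold yields integer index arrays, so the samples are recovered by indexing into the arrays (or with df.iloc for a DataFrame); a minimal sketch with placeholder image and label arrays:

```python
import numpy as np
from sklearn.model_selection import KFold

images = np.random.rand(70, 32, 32, 3)     # placeholder image array
labels = np.random.randint(0, 7, size=70)  # placeholder labels for 7 classes

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kf.split(images):
    X_train, y_train = images[train_idx], labels[train_idx]   # turn indices into samples
    X_val, y_val = images[val_idx], labels[val_idx]
    print(X_train.shape, X_val.shape)
# For a pandas DataFrame, index with df.iloc[train_idx] instead of df[train_idx].
```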
-1 votes, 1 answer

What values to use in a neural network analysis?

I have the following exercise: create a neural network using k-fold cross-validation and evaluate the performance for different configurations. After this, I should compare the values with those obtained using the decision tree model, for…
Diogo Soares
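A minimal sketch of what such a comparison could look like, scoring a couple of MLP configurations against a decision tree with cross_val_score; the iris dataset stands in for the exercise's data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                     # placeholder dataset

configs = [
    ("MLP (50,)", MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)),
    ("MLP (100, 50)", MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=2000, random_state=0)),
    ("Decision tree", DecisionTreeClassifier(random_state=0)),
]
for name, model in configs:
    scores = cross_val_score(model, X, y, cv=10)      # 10-fold CV for each configuration
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```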
-1 votes, 1 answer

k-fold cross validation with MLP algorithm

I have a dataset that is divided into training and test parts. My task is to train it and evaluate my model using k-fold cross validation. I'm a bit confused by the task statement. As far as I know, the point of k-fold cross-validation is to…
Kosh
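One common reading of such a task is to cross-validate on the training part only and keep the test part for a single final evaluation; a minimal sketch under that assumption (the pre-split data below is a placeholder):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder for the pre-split dataset from the task.
X, y = make_classification(n_samples=400, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
cv_scores = cross_val_score(mlp, X_train, y_train, cv=5)   # k-fold on the training part only
print("CV accuracy:", cv_scores.mean())

mlp.fit(X_train, y_train)                                  # refit on the whole training set
print("Test accuracy:", mlp.score(X_test, y_test))         # one final evaluation on the test part
```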
-1 votes, 1 answer

Assessing accuracy within k-fold cross-validation versus hold-out data

I have observed that the mean accuracy after applying StratifiedKFold is higher than the accuracy on the holdout data. I wonder if this can be a sign of overfitting in this case, and if so, can someone explain? The accuracy on the holdout…
-2 votes, 1 answer

When I run a GridSearchCV() classifier with parameters, I get this kind of error: ValueError: could not convert string to float: 'text'

How can I resolve this kind of error? Can anyone guide me, please? X = df.iloc[:,:-2] y = df.My_Labels from sklearn.linear_model import LogisticRegression from sklearn.metrics import accuracy_score, confusion_matrix, classification_report from…
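The error usually means a text column reached the estimator unencoded; one possible fix is to wrap a OneHotEncoder and the classifier in a Pipeline so GridSearchCV can cross-validate the whole thing. The DataFrame below is a placeholder:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Placeholder frame: one text column, one numeric column, and a label column.
df = pd.DataFrame({"text": ["a", "b", "a", "c"] * 25,
                   "num": range(100),
                   "My_Labels": [0, 1] * 50})
X, y = df[["text", "num"]], df["My_Labels"]

pre = ColumnTransformer([("cat", OneHotEncoder(handle_unknown="ignore"), ["text"])],
                        remainder="passthrough")       # encode text, pass numeric columns through
pipe = Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])

grid = GridSearchCV(pipe, {"clf__C": [0.1, 1, 10]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```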
-2 votes, 2 answers

Tuned model with GroupKFold cross-validation requires group parameter when predicting

I tuned a RandomForest with GroupKFold (to prevent data leakage because some rows came from the same group). I get a best fit model, but when I go to make a prediction on the test data it says that it needs the group feature. Does that make sense?…
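The group labels are normally needed only to build the CV splits, not as a model feature, so prediction should not require them; a sketch of that setup with placeholder data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, GroupKFold

# Placeholder data: 200 rows spread over 20 groups; the group ids are NOT features in X.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
groups = rng.integers(0, 20, size=200)

grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    {"n_estimators": [50, 100]},
                    cv=GroupKFold(n_splits=5))
grid.fit(X, y, groups=groups)              # groups are only used to build the CV splits
print(grid.predict(X[:5]))                 # prediction needs no group information
```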
-2 votes, 1 answer

For-loop problem in Python for machine learning

Hi guys, I am trying to perform K-fold cross-validation on this insurance dataset, using a for loop to iterate over an array of integers. The output gives me the following error: ValueError: The number of folds must be of Integral…
Murad24
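That error typically appears when the whole array (or a non-integer element) is passed as n_splits; a sketch of looping over plain Python ints, with a placeholder dataset:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=200, noise=10, random_state=0)   # placeholder data

for k in [3, 5, 10]:                       # loop over candidate fold counts
    kf = KFold(n_splits=int(k), shuffle=True, random_state=0)     # n_splits must be a plain int
    scores = cross_val_score(LinearRegression(), X, y, cv=kf)
    print(k, scores.mean())
```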