A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each round, k-1 folds are used for training and the remaining fold for evaluation. The process is repeated k times, holding out a different fold for evaluation each time.
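As a rough illustration of the procedure described above, here is a minimal pure-Python sketch that produces the train/evaluation index sets for each round. The function name `k_fold_indices` is hypothetical and not part of any library, and the folds are taken as contiguous, unshuffled blocks, which is an assumption made for simplicity; real data is usually shuffled or stratified first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Hypothetical helper for illustration only; no shuffling is applied.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        # The current fold is held out for evaluation ...
        test_idx = list(range(start, stop))
        # ... and the other k-1 folds are used for training.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


if __name__ == "__main__":
    for train_idx, test_idx in k_fold_indices(10, 5):
        print("train:", train_idx, "test:", test_idx)
```

Each sample index appears in exactly one evaluation fold, so every observation is used for evaluation once and for training k-1 times.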
Questions tagged [k-fold]
284 questions
0 votes, 1 answer
Kfold cross validation in python
What I'm trying to do:
Get the K-fold cross-validated scores of an SVM. The data has all numerical independent variables and a categorical dependent variable. I'm using Python 3, sklearn, and feature engine.
My understanding of the matter:
The…

Akshay Ram
0 votes, 2 answers
Creating K dataframe using train_index, test_index of Kfold cross validation in Python using sklearn.cross_validation.KFold()
I am using 5-fold cross-validation in Python with sklearn.cross_validation.KFold() to see how my model performs. It performs well on 4 folds and very poorly on one specific fold. As I am new to data science, I was wondering how I…
0 votes, 1 answer
How to understand a function which splits the data
Can someone help me understand what this function does?
I understand it up to the print line, but after that I'm a bit lost, starting from train_data.
def stratifiedShuffleSplit_data(X, y):
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5,…

EMMAKENJI
0 votes, 0 answers
How to find a specific point on a model in R?
I am working with a CSV dataset called combinedDataset, for which I found a model using a k-fold validation procedure. My x value for the model is the hour meter reading, while my y value is cumulative cost. Here's a dput of my…

Amsi
0 votes, 0 answers
Trouble finding the MSE value during K-fold cross validation procedure
I am currently doing a K-fold cross-validation procedure to determine which model (linear or quadratic) is best for this data. My data comes from a CSV dataset called combinedData, for which I've pasted a dput below:
structure(list(Unit.ID = c(925L,…

Amsi
0 votes, 1 answer
Different RMSE from cross_validate and iterating Kfolds
I want to write my own cross-validation function, as I can't use cross_validate in this situation.
Correct me if I am wrong, but my cross_validate code is:
cv = cross_validate(elastic.est,X,y,cv=5,scoring='neg_mean_squared_error')
output …

Lewis Morris
0 votes, 0 answers
How to create a column in a Dataframe that calculates cross_val_score based on values from another column
I created a DataFrame (df_kfolds) with two columns, kfolds and Mean_Score, where kfolds has values ranging from 3 to 5. I'm trying to calculate the Mean_Score for each kfolds value, derived from the following:
cross_val_score(lr, X, y, cv=3, error_score =…
0 votes, 2 answers
Splitting a data set for K-fold Cross Validation in Sci-Kit Learn
I was assigned a task that requires creating a Decision Tree Classifier and determining the accuracy rates using the training set and 10-fold cross-validation. I went over the documentation for cross_val_predict as I believe that this is the module…

InNeedOfaName
0 votes, 1 answer
10 fold cross validation python
This article describes a deep-learning model using transfer learning and LSTM. The author used 10-fold cross-validation (as explained in Table 3) and took the average of the results.
I am familiar with 10-fold cross-validation, as we need to…

Zahra Hnn
0 votes, 0 answers
getting "Error in xj[i] : only 0's may be mixed with negative subscripts" when performing polynomial regression in for loop
I am trying to use a for-loop to determine the optimal polynomial degrees to use for each variable in my regression, and will then use k-fold cross-validation. I am getting an error "Error in xj[i] : only 0's may be mixed with negative subscripts".…

bgaerber
0 votes, 1 answer
Low F1-score for the first few Folds
I created a classification model using a random forest. To validate the model, I am using the K-Fold method with 10 splits and measuring model performance by F1-score. When I do this, I get a very low F1-score for the first few folds and very high…

LUZO
0 votes, 1 answer
k-fold cross validation to tune regression tree model using pyspark
I'm trying to use k-fold cross-validation to tune a regression tree generated in pyspark. However, from what I've seen so far, it is not possible to combine pyspark's CrossValidator with pyspark's DecisionTree.trainRegressor. Here is the relevant…

wookieluvr13
0 votes, 1 answer
How to prevent one fold from performing a lot worse than the other 9 in 10-fold cross validation for CNN classification
I'm currently working on a 2D CNN in Keras for MRI classification. The class ratio is about 60/40, I have 155 patients, each with one MRI consisting of around 180 slices, the input of the CNN is a slice of an MRI image (256*256 px) (so input in…

Sinraw
0 votes, 0 answers
Do I need to create a new classifier for each fold in K-Fold Cross Validation?
I am trying to train a classifier to detect imperatives.
There are 2000 imperatives and 2000 non-imperatives in my data.
I used 10% of the 4000 sentences (400) as my test set and the remaining 3600 sentences as the training set for the classifiers.
I tried to…

Zong-Ying
0 votes, 2 answers
Select sample from train data based on fold from k-fold cross-validation
I have performed k-fold cross-validation without a package, based on "How to split a data set to do 10-fold cross validation using no packages".
I need to select 30% of the samples from each fold in the training data. Here is my function:
samples = 300
r…

Norin