Questions tagged [k-fold]

A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each round, k-1 folds are used for training and the remaining fold is used for evaluation. The process is repeated k times, holding out a different fold for evaluation each time.
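As a rough illustration of the partitioning described in the tag summary, here is a minimal pure-Python sketch. The helper name `k_fold_indices` is hypothetical (it is not part of any library), and it assumes contiguous, unshuffled folds; real libraries typically also offer shuffling and stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once per fold.

    Hypothetical helper for illustration: folds are contiguous and
    unshuffled, and their sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]                # held-out fold
        train = indices[:start] + indices[start + size:]  # the other k-1 folds
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for i, (train, test) in enumerate(folds):
    print(f"fold {i}: train={train} test={test}")
```

Each sample lands in the evaluation fold exactly once, so averaging the k per-fold scores uses every observation for both training and evaluation without ever scoring a model on data it was trained on.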

284 questions
0
votes
1 answer

Kfold cross validation in python

What I'm trying to do: get the K-fold cross-validated scores of an SVM. The data has all numerical independent variables and a categorical dependent variable. I'm using Python 3, sklearn and feature engine. My understanding of the matter: The…
Akshay Ram
  • 25
  • 9
0
votes
2 answers

Creating K dataframe using train_index, test_index of Kfold cross validation in Python using sklearn.cross_validation.KFold()

I am using 5-fold cross-validation in Python with sklearn.cross_validation.KFold() to see how my model performs. It performs well on 4 folds and very poorly on one specific fold. As I am new to data science, I was wondering how I…
0
votes
1 answer

How to understand a function which splits the data

Can someone help me understand what this function does? I understand up to the print line, but after that I'm a bit lost. Starting from train_data. def stratifiedShuffleSplit_data(X, y): sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5,…
EMMAKENJI
  • 359
  • 2
  • 5
  • 14
0
votes
0 answers

How to find a specific point on a model in R?

I am working with a CSV dataset called combinedDataset, for which I found a model using a k-fold validation procedure. My x value for the model is the hour meter reading, while my y value is the cumulative cost. Here's a dput of my…
Amsi
  • 11
  • 2
0
votes
0 answers

Trouble finding the MSE value during K-fold cross validation procedure

I am currently doing a K-fold cross-validation procedure to determine which model (linear or quadratic) is best for this data. My data comes from a CSV dataset called combinedData, for which I've pasted a dput below: structure(list(Unit.ID = c(925L,…
Amsi
  • 11
  • 2
0
votes
1 answer

Different RMSE from cross_validate and iterating Kfolds

I want to write my own function for cross-validation, as I can't use cross_validate in this situation. Correct me if I am wrong, but my cross_validate code is: cv = cross_validate(elastic.est,X,y,cv=5,scoring='neg_mean_squared_error') output …
Lewis Morris
  • 1,916
  • 2
  • 28
  • 39
0
votes
0 answers

How to create a column in a Dataframe that calculates cross_val_score based on values from another column

I created a DataFrame (df_kfolds) with 2 columns: kfolds & Mean_Score where Kfolds has values ranging from 3 to 5. I'm trying to calculate the mean_score of each kfold derived from the following: cross_val_score(lr, X, y, cv=3, error_score =…
0
votes
2 answers

Splitting a data set for K-fold Cross Validation in Sci-Kit Learn

I was assigned a task that requires creating a Decision Tree Classifier and determining the accuracy rates using the training set and 10-fold cross-validation. I went over the documentation for cross_val_predict as I believe that this is the module…
0
votes
1 answer

10 fold cross validation python

There is a deep-learning model using transfer learning and LSTM in this article; the author used 10-fold cross-validation (as explained in Table 3) and took the average of the results. I am familiar with 10-fold cross-validation, as we need to…
0
votes
0 answers

getting "Error in xj[i] : only 0's may be mixed with negative subscripts" when performing polynomial regression in for loop

I am trying to use a for-loop to determine the optimal polynomial degrees to use for each variable in my regression, and will then use k-fold cross-validation. I am getting an error "Error in xj[i] : only 0's may be mixed with negative subscripts".…
bgaerber
  • 76
  • 7
0
votes
1 answer

Low F1-score for the first few Folds

I created a classification model using a random forest. To validate the model I am using the K-Fold method with 10 splits and measuring model performance by F1-score. When I do this, I get a very low F1-score for the first few folds and a very high…
LUZO
  • 1,019
  • 4
  • 19
  • 42
0
votes
1 answer

k-fold cross validation to tune regressive tree model using pyspark

I'm trying to use k-fold cross-validation to tune a regressive tree generated in pyspark. However, from what I've seen so far, it is not possible to combine pyspark's CrossValidator with pyspark's DecisionTree.trainRegressor. Here is the relevant…
wookieluvr13
  • 103
  • 1
  • 6
0
votes
1 answer

How to prevent one fold to perform a lot worse than the other 9 in 10-fold cross validation for CNN classification

I'm currently working on a 2D CNN in Keras for MRI classification. The class ratio is about 60/40, I have 155 patients, each with one MRI consisting of around 180 slices, the input of the CNN is a slice of an MRI image (256*256 px) (so input in…
0
votes
0 answers

Do I need to create a new classifier for each fold in K-Fold Cross Validation?

I am trying to train a classifier to detect imperatives. There are 2000 imperatives and 2000 non-imperatives in my data. I used 10% of the 4000 sentences (400) as my test set, and the remaining 3600 sentences as the training set for the classifiers. I tried to…
0
votes
2 answers

Select sample from train data based on fold from k-fold cross-validation

I have performed k-fold cross-validation without a package, based on "How to split a data set to do 10-fold cross validation using no packages". I need to select 30% of the samples from each fold in the train data. Here is my function: samples = 300 r…
Norin
  • 1
  • 3