Questions tagged [cross-validation]

Cross-Validation is a method of evaluating and comparing predictive systems in statistics and machine learning.

Cross-Validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model.

In typical cross-validation, the training and validation sets cross over in successive rounds so that each data point gets a chance to be validated against the model. The basic form of cross-validation is k-fold cross-validation.

Other forms of cross-validation are special cases of k-fold cross-validation or involve repeated rounds of k-fold cross-validation.
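For illustration, a minimal k-fold sketch in Python, assuming scikit-learn; the dataset and estimator are arbitrary stand-ins:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the k folds serves exactly once as the validation set while the
# remaining k-1 folds are used for training.
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y,
    cv=KFold(n_splits=5, shuffle=True, random_state=0))
print(scores, scores.mean())
```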

2604 questions
0 votes, 0 answers

How to use learning curves and cross-validation?

My aim is to prove whether there is overfitting or underfitting. However, when I calculate the learning curves (which graphically depict how performance improves with training), the standard deviation of the cross-validation score is enormous. My observation here is…
vdu16 · 123
0 votes, 0 answers

How to plot the mean ROC curve across folds for each class in a multiclass classification

I evaluate the performance of a random forest using 5-fold cross-validation on a multiclass classification problem. The curve I get looks like the attached picture. The code I use is as follows: cv=StratifiedKFold(n_splits=5) classifier =…
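One way to produce such a plot, sketched on synthetic data; the interpolation grid and classifier settings are assumptions, not the asker's code:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import label_binarize

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           random_state=0)
classes = np.unique(y)
cv = StratifiedKFold(n_splits=5)
mean_fpr = np.linspace(0, 1, 100)
tprs = {c: [] for c in classes}

for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    proba = clf.predict_proba(X[test_idx])
    y_bin = label_binarize(y[test_idx], classes=classes)
    for i, c in enumerate(classes):
        fpr, tpr, _ = roc_curve(y_bin[:, i], proba[:, i])
        tprs[c].append(np.interp(mean_fpr, fpr, tpr))  # align folds on one grid

for c in classes:  # average the per-fold TPRs, one curve per class
    mean_tpr = np.mean(tprs[c], axis=0)
    plt.plot(mean_fpr, mean_tpr,
             label=f"class {c} (mean AUC = {auc(mean_fpr, mean_tpr):.2f})")
plt.plot([0, 1], [0, 1], "k--")
plt.xlabel("False positive rate"); plt.ylabel("True positive rate")
plt.legend(); plt.show()
```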
0 votes, 0 answers

Confusing results when running the quanteda.classifiers::crossval function

I have been trying to use the following code to run the integrated quanteda crossval function. The code works, but the results look really strange to me in that they differ a lot from what I get when I implement a cross-validation loop…
0 votes, 0 answers

How to use the commonly used wrapper for models from statsmodels to apply cross-validation?

I read the relevant discussion here: Using statsmodel estimations with scikit-learn cross validation, is it possible? In that discussion it is advised to use a wrapper for models from statsmodels so that the cross_val_score function…
Xtiaan · 252
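The wrapper pattern advised in that discussion looks roughly like the following sketch; the class name SMWrapper and the add_constant handling are illustrative choices:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_val_score

class SMWrapper(BaseEstimator, RegressorMixin):
    """Wrap a statsmodels regression class (e.g. sm.OLS) as an sklearn estimator."""
    def __init__(self, model_class, add_constant=True):
        self.model_class = model_class
        self.add_constant = add_constant

    def fit(self, X, y):
        X = sm.add_constant(X) if self.add_constant else X
        self.results_ = self.model_class(y, X).fit()
        return self

    def predict(self, X):
        X = sm.add_constant(X) if self.add_constant else X
        return self.results_.predict(X)

X = np.random.RandomState(0).normal(size=(100, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * np.random.RandomState(1).normal(size=100)
print(cross_val_score(SMWrapper(sm.OLS), X, y, cv=5))  # R^2 per fold
```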
0 votes, 0 answers

What's the correct way to format X and y from a binary dataframe for Stratified K-Fold cross-validation

My data is a dataframe of 25 columns and 2737 rows containing binary data. The goal is to train using each row as an input and get as output a probabilistic prediction of what the next sequence could be. Data in this scenario is always…
Wisdom · 121
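A minimal sketch of the formatting step, assuming one of the 25 binary columns (the hypothetical c24 below) is the target to stratify on:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

rng = np.random.RandomState(0)
df = pd.DataFrame(rng.randint(0, 2, size=(2737, 25)),
                  columns=[f"c{i}" for i in range(25)])

X = df.drop(columns="c24").to_numpy()  # 24 input columns as a 2-D array
y = df["c24"].to_numpy()               # the column being predicted

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # StratifiedKFold stratifies on y; rows are indexed positionally.
    print(fold, train_idx.shape, test_idx.shape)
```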
0 votes, 0 answers

Getting an incorrect number of rows when using the predict function in a cross-validation exercise

I'm performing a K-fold exercise with K = 10 for polynomials of degree 1 to 5, with the purpose of identifying which polynomial best fits the data provided. Nevertheless, when I try to predict Y-hat using the testing data (X-test), which…
Lucpi · 1
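A sketch of the exercise in Python (the question itself may use another language); predicting with the fold's test rows always yields exactly one prediction per test row:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.3 * rng.normal(size=200)

kf = KFold(n_splits=10, shuffle=True, random_state=0)
for degree in range(1, 6):
    mses = []
    for train_idx, test_idx in kf.split(X):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X[train_idx], y[train_idx])
        y_hat = model.predict(X[test_idx])      # len(y_hat) == len(test_idx)
        mses.append(mean_squared_error(y[test_idx], y_hat))
    print(f"degree {degree}: CV MSE = {np.mean(mses):.4f}")
```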
0 votes, 0 answers

Custom cross-validation for Ridge in sklearn

I have written the following algorithm to implement a Ridge regression and estimate its parameter via cross-validation. In particular, I wanted to achieve the following: for the purpose of cross-validation, the train set is divided into 10 folds.…
NC520 · 346
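For comparison, a compact sketch of the same goal using GridSearchCV, where the 10 folds are carved out of the training set only; the alpha grid is an assumption:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=300, n_features=10, noise=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 10-fold CV over the penalty strength, using the training set only.
grid = GridSearchCV(Ridge(), {"alpha": np.logspace(-3, 3, 13)}, cv=10)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))
```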
0 votes, 0 answers

Bootstrapping the uncertainty on an RMSE estimate of a location-scale generalized additive model

I have height data of plants (numeric, in cm; Height) measured over time (numeric, days of the year; Doy). These data are grouped per genotype (factor; Genotype) and individual plant (factor; Individual). I've…
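A generic sketch of bootstrapping an RMSE estimate; a plain linear model stands in for the location-scale GAM, since the resampling logic is the same:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -2.0]) + rng.normal(size=200)
pred = LinearRegression().fit(X, y).predict(X)

boot_rmse = []
for _ in range(1000):
    idx = rng.randint(0, len(y), len(y))     # resample cases with replacement
    boot_rmse.append(np.sqrt(np.mean((y[idx] - pred[idx]) ** 2)))
print(np.percentile(boot_rmse, [2.5, 97.5]))  # interval for the RMSE estimate
```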
0 votes, 0 answers

Unexpected behaviour (inflated results on random-data) in scikit-learn with nested cross-validation

When trying to train/evaluate a support vector machine in scikit-learn, I am experiencing some unexpected behaviour and I am wondering whether I am doing something wrong or whether this is a possible bug. In a very specific subset of circumstances,…
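A sanity-check sketch of nested cross-validation on pure-noise data: with labels independent of the features, outer scores should hover around chance (about 0.5 for two balanced classes), and consistently higher scores would suggest leakage. The grid values below are assumptions:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 20))
y = rng.randint(0, 2, size=200)   # labels independent of X

inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)  # inner loop tunes C
scores = cross_val_score(inner, X, y, cv=5)             # outer loop evaluates
print(scores.mean())  # expect roughly 0.5 on noise
```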
0 votes, 0 answers

About Sklearn double cross validation with wrapper feature_selection

About Double-CV or Nested-CV. The simplest example would be: from sklearn.model_selection import cross_val_score, GridSearchCV; from sklearn.ensemble import RandomForestRegressor; from sklearn.pipeline import Pipeline; gcv =…
x H · 11
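A hedged completion of the pattern the excerpt starts, with RFE standing in as the wrapper feature selector and illustrative parameter values:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_regression(n_samples=200, n_features=15, noise=10, random_state=0)

pipe = Pipeline([
    ("select", RFE(RandomForestRegressor(n_estimators=50, random_state=0))),
    ("model", RandomForestRegressor(random_state=0)),
])
gcv = GridSearchCV(pipe, {"select__n_features_to_select": [5, 10]}, cv=3)
# Outer CV: the selector and grid search are refit inside every outer fold,
# so feature selection never sees the outer test data.
print(cross_val_score(gcv, X, y, cv=5))
```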
0 votes, 0 answers

Implementation of early stopping with gradient descent

I am developing an algorithm based on gradient descent and I would like to add early stopping regularization. I have an objective function, F, and I minimize it with respect to W. This is given in the code below: Data: X_Train, Y_Train t=1; while (t…
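A minimal sketch of early stopping wrapped around plain gradient descent, assuming a least-squares objective F(W) and a held-out validation split; the patience counter is a common convention, not the asker's code:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.5 * rng.normal(size=200)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

W = np.zeros(5)
best_W, best_val, patience, wait = W.copy(), np.inf, 10, 0
for t in range(10000):
    grad = X_tr.T @ (X_tr @ W - y_tr) / len(y_tr)   # gradient of 0.5 * MSE
    W -= 0.1 * grad
    val_loss = np.mean((X_val @ W - y_val) ** 2)
    if val_loss < best_val:                 # keep the best model seen so far
        best_val, best_W, wait = val_loss, W.copy(), 0
    else:
        wait += 1
        if wait >= patience:                # stop once validation stalls
            break
print(t, best_val)
```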
0 votes, 0 answers

Troubles with Cross-Validation

I am having trouble implementing cross-validation. I understand that after cross-validation I have to re-train the model, but I have the following doubts: do a train_test_split before cross-validation and use X_train and y_train for cross-validation…
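The commonly recommended workflow this question is circling looks like the sketch below: split once, cross-validate on the training portion only, then re-fit on the full training set:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
print(cross_val_score(model, X_train, y_train, cv=5))  # model selection
model.fit(X_train, y_train)                 # re-train on all of the train set
print(model.score(X_test, y_test))          # final estimate on held-out data
```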
0 votes, 1 answer

Error in newdata[, object$model.list$variables] : subscript out of bounds

When I run this code, I get the error "Error in newdata[, object$model.list$variables] : subscript out of bounds". I cannot figure out how to solve…
linta · 15
0 votes, 0 answers

Plot training metrics from multiple cross-validation folds in TensorFlow

I'm closely following code from this tutorial for my data and it's training nicely: https://www.tensorflow.org/tutorials/structured_data/imbalanced_data#class_weights The only key difference I've made (other than the dataset) is that I perform k-fold…
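A hedged sketch of collecting and overlaying per-fold Keras training histories; random data replaces the tutorial's dataset, and the tiny model is an arbitrary stand-in:

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.model_selection import KFold

rng = np.random.RandomState(0)
X = rng.normal(size=(500, 10)).astype("float32")
y = rng.randint(0, 2, size=500).astype("float32")

histories = []
for train_idx, val_idx in KFold(n_splits=5).split(X):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    h = model.fit(X[train_idx], y[train_idx],
                  validation_data=(X[val_idx], y[val_idx]),
                  epochs=10, verbose=0)
    histories.append(h.history)   # one History dict per fold

for i, h in enumerate(histories):  # overlay the validation-loss curves
    plt.plot(h["val_loss"], label=f"fold {i} val_loss")
plt.xlabel("epoch"); plt.ylabel("loss"); plt.legend(); plt.show()
```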
0 votes, 1 answer

scikit-learn cross_validate: reveal test set indices

In sklearn.model_selection.cross_validate, is there a way to output the samples/indices that were used as the test set by the CV splitter for each fold?
roble · 304
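One answer-shaped sketch: recent scikit-learn versions (1.3+) add a return_indices parameter to cross_validate; on older versions, iterating the splitter directly reproduces the same indices:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_validate

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=5)
res = cross_validate(LogisticRegression(max_iter=1000), X, y, cv=cv,
                     return_indices=True)
print(res["indices"]["test"][0])   # test-set indices of the first fold

# Without return_indices: the splitter yields the same index arrays.
for train_idx, test_idx in cv.split(X, y):
    pass  # test_idx matches the indices cross_validate used per fold
```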