Questions tagged [cross-validation]

Cross-Validation is a method of evaluating and comparing predictive systems in statistics and machine learning.

Cross-Validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model.

In typical cross-validation, the training and validation sets cross over in successive rounds so that each data point gets a chance to be validated. The basic form of cross-validation is k-fold cross-validation.

Other forms of cross-validation are special cases of k-fold cross-validation or involve repeated rounds of k-fold cross-validation.

2604 questions
0
votes
0 answers

Creating a rolling window forecast in R

I need your help understanding rolling window and expanding window forecasting strategies in R. I am using inflation data from Thailand between January 2003 and December 2014. My problem is as follows: A) I wish to conduct an out-of-sample…
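The split logic behind the two strategies is language-agnostic; a minimal Python sketch (hypothetical series length, window size, and horizon) of how the training indices differ:

```python
import numpy as np

y = np.random.randn(144)   # placeholder: 12 years of monthly inflation data
window, horizon = 60, 1    # hypothetical 5-year window, one-step-ahead forecast

for origin in range(window, len(y) - horizon + 1):
    rolling_train   = y[origin - window:origin]   # fixed-size window slides forward
    expanding_train = y[:origin]                  # training set grows each round
    test            = y[origin:origin + horizon]  # out-of-sample target
    # fit the model on rolling_train or expanding_train, then forecast `test`
```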
0
votes
0 answers

Retrain model after cross-validation

So, as can be seen here, here and here, we should retrain our model using the whole dataset after we are satisfied with our CV results. Check the following code to train a Random Forest: from sklearn.ensemble import RandomForestClassifier from…
Murilo
  • 533
  • 3
  • 15
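A minimal sketch of that workflow, with make_classification standing in for the real data: CV estimates how the model will generalize, and the deployed model is then refit on everything.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # placeholder data
clf = RandomForestClassifier(random_state=0)

# 1) Use CV only to estimate generalization performance / tune hyperparameters.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())

# 2) Once satisfied, refit on the whole dataset; this final model is the one you deploy.
clf.fit(X, y)
```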
0
votes
0 answers

Evaluate multiple metrics in a single GridSearchCV with scikit-survival

Currently, I am doing a simulation to compare multiple models; my study doesn't require the best_estimator_, only the results from cv_results_. The problem that I have is that I need the integrated_brier_score and cumulative_dynamic_auc for each…
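scikit-survival's integrated_brier_score and cumulative_dynamic_auc are plain functions, so each would need to be wrapped as a callable scorer; the general multi-metric pattern in scikit-learn itself is sketched below, with a standard classifier and built-in metrics as stand-ins. With a dict of scorers and refit=False, GridSearchCV records every metric in cv_results_ without needing a best_estimator_.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)  # placeholder data

grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100]},
    scoring={"acc": "accuracy", "auc": "roc_auc"},  # one entry per metric
    refit=False,   # no single best_estimator_; all metrics land in cv_results_
    cv=5,
)
grid.fit(X, y)
print(grid.cv_results_["mean_test_acc"], grid.cv_results_["mean_test_auc"])
```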
0
votes
0 answers

Sklearn Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead

I have a multiclass problem (12 classes) and I am using LabelBinarizer to "one-hot-encode" my output. I was trying to use the LabelBinarizer first, then split the data using StratifiedKFold, but got this error: Supported target types are: ('binary',…
Murilo
  • 533
  • 3
  • 15
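One common fix is to let StratifiedKFold split on the original integer labels and apply the binarizer per fold; a sketch with placeholder data:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import LabelBinarizer

X = np.random.randn(120, 4)             # placeholder features
y = np.random.randint(0, 12, size=120)  # integer class labels, 12 classes

lb = LabelBinarizer().fit(y)  # fit on all labels so every fold gets 12 columns
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Split on the integer labels (which StratifiedKFold understands),
# and one-hot encode only where the model actually needs it.
for train_idx, test_idx in skf.split(X, y):
    y_train_ohe = lb.transform(y[train_idx])
    y_test_ohe = lb.transform(y[test_idx])
```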
0
votes
1 answer

Cross-validation logistic regression returns very different accuracies

I'm running cross-validation on logistic regression, and I've run into a strange issue where the train and test accuracy are all 100% except for the very first and second fold, which are about 66% accuracy. 100% accuracy is definitely wrong and I am…
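Without the data it is hard to be sure, but a frequent cause of this exact pattern is rows ordered by class, so that unshuffled folds get degenerate label distributions (target leakage is the other usual suspect). A sketch of the shuffled, stratified alternative on simulated class-sorted data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)  # placeholder data
order = np.argsort(y)          # simulate rows arriving sorted by class label
X, y = X[order], y[order]

# Unshuffled folds on sorted data are degenerate; shuffling + stratifying fixes it.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv))
```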
0
votes
0 answers

How to do Cross-Validation on Neural Networks with multiple binary classification outputs?

I am trying to use StratifiedKFold to do cross-validation on my CNN that outputs multiple binary classifications. However, StratifiedKFold is unable to process multilabel indicators. skf = StratifiedKFold(n_splits=10, shuffle=True,…
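scikit-learn's StratifiedKFold only handles single-label targets; the third-party iterative-stratification package offers a drop-in MultilabelStratifiedKFold. A sketch, assuming that package is installed, with placeholder arrays:

```python
import numpy as np
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold

X = np.random.randn(200, 16)           # placeholder inputs
Y = np.random.randint(0, 2, (200, 5))  # 5 binary outputs per sample

# MultilabelStratifiedKFold accepts a multilabel-indicator target directly.
mskf = MultilabelStratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in mskf.split(X, Y):
    X_train, Y_train = X[train_idx], Y[train_idx]
    X_test, Y_test = X[test_idx], Y[test_idx]
    # build and fit the CNN on each fold here
```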
0
votes
0 answers

How do I calculate R-squared from LightGBM cross-validated models in R?

I'm trying to run LightGBM with 5-fold cross-validation to predict the first 123 PCs of a plasma metabolite principal component analysis. I'd like to get the R-squared for the best iteration for each outcome, but can't find a direct way to extract…
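The question is about the R package, but one route that works in either language is computing R² from out-of-fold predictions instead of digging it out of the CV object; in Python (placeholder data standing in for one PC outcome) this looks like:

```python
import numpy as np
from lightgbm import LGBMRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_predict

X = np.random.randn(500, 20)                    # placeholder predictors
y = 0.5 * X[:, 0] + 0.1 * np.random.randn(500)  # placeholder outcome (one PC)

# Out-of-fold predictions from 5-fold CV, then R^2 against the observed values;
# repeat per outcome (per PC) as needed.
oof = cross_val_predict(LGBMRegressor(), X, y, cv=5)
print(r2_score(y, oof))
```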
0
votes
0 answers

How can I draw a dynamic graph in Python?

I just started using Python. I would like to plot a dynamic graph that shows me the performance (in terms of accuracy) of a kNN algorithm obtained with n-fold cross-validation. I would like to get a graph where x = k nearest neighbours and y = average…
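A minimal sketch of that plot, using the iris dataset as a stand-in: one mean CV accuracy per candidate k.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # placeholder dataset

ks = range(1, 31)
means = [cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=10).mean()
         for k in ks]

plt.plot(ks, means, marker="o")
plt.xlabel("k nearest neighbours")
plt.ylabel("mean CV accuracy")
plt.show()
```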
0
votes
0 answers

CatBoostError: Bad value for num_feature

I have code: from sklearn.model_selection import KFold, cross_val_predict from catboost import Pool, CatBoostRegressor, cv from sklearn.metrics import mean_squared_error import matplotlib.pyplot as plt import pandas as pd import numpy as np import…
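This error typically points at malformed feature input (for example non-numeric columns not declared as categorical when building the Pool); for comparison, a minimal working cv() call with all-numeric placeholder data:

```python
import numpy as np
from catboost import Pool, cv

X = np.random.randn(200, 5)          # all-numeric placeholder features
y = X[:, 0] + np.random.randn(200)

pool = Pool(X, label=y)  # declare cat_features here if any column is categorical
params = {"loss_function": "RMSE", "iterations": 100, "verbose": False}

results = cv(pool, params, fold_count=5)  # DataFrame of per-iteration fold metrics
print(results.tail(1))
```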
0
votes
0 answers

Databricks PySpark error when attempting to fit a CrossValidator object

I am facing the exact same issue as the one in the linked question: when calling cross-validation in Databricks I get a weird error just like the one mentioned there. Can someone please help?
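Without the actual stack trace the cause is hard to pin down; a minimal self-contained CrossValidator example (toy DataFrame) can at least isolate whether the failure lies in the data pipeline or the cluster environment:

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.linalg import Vectors
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(Vectors.dense([0.0, 1.1]), 0.0), (Vectors.dense([2.0, 1.0]), 1.0),
     (Vectors.dense([0.1, 1.2]), 0.0), (Vectors.dense([2.2, 0.9]), 1.0),
     (Vectors.dense([0.2, 1.3]), 0.0), (Vectors.dense([2.1, 1.1]), 1.0)],
    ["features", "label"])

lr = LogisticRegression()
grid = ParamGridBuilder().addGrid(lr.regParam, [0.01, 0.1]).build()
cv = CrossValidator(estimator=lr, estimatorParamMaps=grid,
                    evaluator=BinaryClassificationEvaluator(), numFolds=2)
model = cv.fit(df)  # if even this fails, suspect the environment, not the model
```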
0
votes
0 answers

How can I train a caret model with time slices while holding out one or more groups for each cross-validation fold?

I'm trying to train a model on a panel of different units over time. I understand how to use createTimeSlices from the caret package, but I'd like to use this same process while simultaneously holding out different units in different training folds.…
broodoots
  • 7
  • 1
  • 2
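caret's createTimeSlices knows nothing about groups, so one option is to build the fold indices by hand, crossing time origins with held-out units, and hand them to caret through trainControl's index/indexOut arguments. The index construction itself is language-neutral, sketched here in Python with a toy panel:

```python
import numpy as np

# Placeholder panel: 4 units each observed over 20 periods.
units = np.repeat(np.arange(4), 20)
time = np.tile(np.arange(20), 4)

folds = []
for origin in range(10, 20):            # expanding time origin, as in createTimeSlices
    for held_out in np.unique(units):   # additionally hold out one unit per fold
        train = np.where((time < origin) & (units != held_out))[0]
        test = np.where((time == origin) & (units == held_out))[0]
        folds.append((train, test))
```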
0
votes
0 answers

Using SMOTE with imblearn pipeline and cross-validation

This is more of a theoretical question, but I am dealing with a pretty imbalanced dataset. Therefore I want to use SMOTE to rebalance the data in order to achieve better results with my models. Now I read that, to avoid data leakage, only the training…
wihee
  • 55
  • 7
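That reading is correct: resampling must happen inside each training fold only. imblearn's own Pipeline guarantees this, because samplers run only at fit time; a sketch with simulated imbalanced data:

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# SMOTE is applied when the pipeline is fit, i.e. only to the training folds;
# the validation fold of each CV round stays untouched.
pipe = Pipeline([("smote", SMOTE(random_state=0)),
                 ("clf", LogisticRegression(max_iter=1000))])
print(cross_val_score(pipe, X, y, cv=5, scoring="f1"))
```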
0
votes
0 answers

Nested cross-validation for XGBoost and Random Forest models

The inner fold and outer fold don't seem to be correct. I am not sure if I am using the training and testing datasets properly. Any help is welcome :) ... # Scale the data scaler = StandardScaler() X_scaled = scaler.fit_transform(X) # Set the outer…
cabral279
  • 97
  • 10
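A compact nested-CV sketch with placeholder data. Note that the question's code scales X before splitting, which leaks test-fold statistics into training; putting the scaler inside a Pipeline confines it to each training fold.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, random_state=0)  # placeholder data

# Scaler inside the pipeline: re-fit on each training fold only, no leakage.
pipe = Pipeline([("scaler", StandardScaler()),
                 ("rf", RandomForestClassifier(random_state=0))])

# Inner loop tunes hyperparameters; outer loop estimates generalization error.
inner = GridSearchCV(pipe, {"rf__max_depth": [3, 5, None]},
                     cv=KFold(5, shuffle=True, random_state=0))
outer_scores = cross_val_score(inner, X, y,
                               cv=KFold(5, shuffle=True, random_state=1))
print(outer_scores.mean())
```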
0
votes
1 answer

How to suppress warning messages when lightgbm is used?

I am using lightgbm to train LGBM models in R. However, whenever I call the lgb.cv() function, lots of warning messages come out. My code is written as: train_params <- list(objective = "binary", learning_rate = 0.2, num_leaves = 50L, …
Phoebe
  • 53
  • 5
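The relevant switch is LightGBM's core verbosity parameter (alias verbose), which both the R and Python APIs pass through; values below zero keep only fatal messages. Shown here in Python form (in the R API the equivalent is verbose = -1 in the parameter list):

```python
import numpy as np
import lightgbm as lgb

X = np.random.randn(500, 10)          # placeholder features
y = np.random.randint(0, 2, 500)      # binary labels
dtrain = lgb.Dataset(X, label=y)

params = {"objective": "binary", "learning_rate": 0.2,
          "num_leaves": 50, "verbosity": -1}  # verbosity < 0: fatal messages only

results = lgb.cv(params, dtrain, num_boost_round=100, nfold=5)
```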
0
votes
1 answer

Marginal R2 for linear mixed models in cross-validation

For a prediction problem I am working on I want to calculate the variance in the data explained by the linear effects of my linear mixed model. To evaluate my predictive performance I plan on using five-fold cross-validation. The common approach to…
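For reference, the marginal R² of Nakagawa & Schielzeth (2013), which captures only the fixed-effects (linear) part of the model, is the fixed-effects variance over the total variance:

    R²_marginal = σ²_f / (σ²_f + Σ σ²_α + σ²_ε)

where σ²_f is the variance of the fixed-effects predictions, Σ σ²_α the summed random-effect variances, and σ²_ε the residual variance. Under cross-validation, the same quantity can be computed per fold from the model fit to the training data.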