Questions tagged [cross-validation]

Cross-Validation is a method of evaluating and comparing predictive systems in statistics and machine learning.

Cross-Validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model.

In typical cross-validation, the training and validation sets cross over in successive rounds so that each data point gets a chance to be used for validation. The basic form of cross-validation is k-fold cross-validation.

Other forms of cross-validation are special cases of k-fold cross-validation or involve repeated rounds of k-fold cross-validation.
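For concreteness, here is a minimal k-fold cross-validation sketch in Python with scikit-learn; the iris data, logistic regression model, and choice of k = 5 are illustrative assumptions rather than part of the tag definition.

# Minimal k-fold cross-validation sketch (dataset, model, and k are arbitrary choices).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Split the data into k folds; each fold is used once for validation
# while the remaining k-1 folds are used for training.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())

Averaging the per-fold scores gives a less noisy estimate of out-of-sample performance than a single train/validation split.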

2604 questions
0
votes
1 answer

Label a certain x,y data point on a validation curve

valid1 = plot_validation_curve(rand_search.best_estimator_, X_train, y_train, cv=StratifiedKFold(n_splits=5), param_range=np.arange(2,100,2), param_name = 'max_depth', scoring='f1') I…
fan-yang
  • 13
  • 3
0
votes
1 answer

Print classification results with k-fold cross-validation using the sklearn package

I have a dataset that I split using the holdout method in sklearn. The following is the procedure: from sklearn.model_selection import train_test_split (X_train, X_test, y_train, y_test)=train_test_split(X,y,test_size=0.3, stratify=y) I am using… (one possible approach is sketched after this entry)
Encipher
  • 1,370
  • 1
  • 14
  • 31
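One way to get a per-class report from a k-fold run, sketched under the assumption that pooled out-of-fold predictions are acceptable (this is not necessarily the asker's exact setup): use cross_val_predict with a StratifiedKFold splitter and pass the predictions to classification_report.

# Hedged sketch: out-of-fold predictions with StratifiedKFold, then a report.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import StratifiedKFold, cross_val_predict

X, y = load_iris(return_X_y=True)          # placeholder data
model = LogisticRegression(max_iter=1000)  # placeholder classifier

# Each sample is predicted by a model trained on the folds it was not in.
y_pred = cross_val_predict(model, X, y, cv=StratifiedKFold(n_splits=5))
print(classification_report(y, y_pred))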
0
votes
1 answer

Get the best model after cross validation

How do I get the best model after training with k-fold cross-validation without a grid search? For example: model = XGBClassifier(**best_params) cv_scores = cross_val_score(model, X_train, Y_train, cv=5, scoring='f1') I am not sure how to get the… (a hedged sketch of two options follows this entry)
Vicky
  • 33
  • 5
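cross_val_score only returns the scores, not the fitted models. A hedged sketch of two common options, with a scikit-learn classifier standing in for XGBClassifier(**best_params): keep the per-fold estimators via cross_validate(..., return_estimator=True), or use cross-validation only for evaluation and refit on the full training set.

# Hedged sketch; GradientBoostingClassifier stands in for XGBClassifier(**best_params).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_validate

X_train, Y_train = make_classification(n_samples=500, random_state=0)
model = GradientBoostingClassifier()

# return_estimator=True keeps the model fitted on each training fold.
results = cross_validate(model, X_train, Y_train, cv=5,
                         scoring="f1", return_estimator=True)

best_fold = int(np.argmax(results["test_score"]))
best_model = results["estimator"][best_fold]   # model from the best-scoring fold

# More common choice: use CV only to estimate performance, then refit on all data.
final_model = GradientBoostingClassifier().fit(X_train, Y_train)

Selecting the per-fold "best" estimator optimistically biases its reported score, which is why refitting on the full training set is usually preferred.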
0
votes
0 answers

How to do cross-validation on a docplex MILP model?

I've created some mixed integer linear programming models for feature selection in classification based on support vector machines. Now I need to do cross-validation on these models, but I can't figure out how to use the scikit-learn library to apply…
0
votes
0 answers

HDBSCAN Random Search Fine-Tuning

Context: I am trying to fine-tune my HDBSCAN model from the hdbscan Python library using sklearn's RandomizedSearchCV. However, I am facing the following error: scores = scorer(estimator, X_test) ^^^^^^^^^^^^^^^^^^^^^^^^^ TypeError:…
Mayow
  • 1
  • 1
0
votes
0 answers

Tuning Arguments for CV / Regression Trees

When I enter: tune_spec<- decision_tree(min_n= tune(), tree_depth= tune(), cost_complexity=tune()) %>% set_engine("rpart") %>% set_mode("regression") tree_grid<- tune_spec %>% extract_parameter_set_dials() %>% …
0
votes
0 answers

Surprise NMF object is not callable

I am building a recommender system using the Sushi Preference Dataset and the NMF (Non-negative Matrix Factorization) model. I am implementing it using the Surprise library. I want to use randomized search CV for hyperparameter tuning…
0
votes
0 answers

How to do nested cross-validation on folds in R using the glmnet package

I am trying to generate a model using the glmnet package in R. I want to do these steps: Randomly split the data into 5 folds. For each fold: a. Remove the fold from the data. b. Use the remaining data to train an elastic-net model using 10-fold…
rheabedi1
  • 65
  • 7
0
votes
0 answers

sklearn cross_val_score always returning the same non-zero values

I tried using a logistic regression model to predict some data, and the first time I used cross_val_score it seemed fine. But when I tried to drop some of the less important features and rerun cross_val_score on the reduced data, it gives the same…
Chris
  • 155
  • 6
0
votes
1 answer

Why is the mean ROC score from GridSearchCV using only 1 CV split different from the ROC calculated with the grid_search.score method or the roc_auc_score function?

I was experimenting with sklearn's GridSearchCV, and I don't understand why the mean ROC scores I get when using a single split defined with an iterable are different from what I get running the score method after fitting, or the roc_auc_score…
0
votes
2 answers

Understanding the Substantial Performance Discrepancy between Stratified K-Fold Cross Validation and No Cross Validation in my Prediction

I have developed two versions of my code, where one incorporates stratified k-fold cross-validation while the other lacks any form of cross-validation. To my surprise, the results achieved using stratified k-fold cross-validation significantly…
0
votes
0 answers

Performing backward variable selection in R based on test data prediction

How can I apply backward variable selection based on performance on test data in R? I already know that there is the stepAIC() function, which does almost what I want, but in every step it removes one variable based on the AIC criterion. I want to do…
Joshua_ABC
  • 13
  • 3
0
votes
0 answers

Cross-validation score / training score / test score: which should I consider to say whether a model is well generalised?

I am new to the machine learning domain and I want to clear up a doubt. My model is a multi-class classification model based on a SMILES notation dataset. My dataset has fewer than 1000 rows and it is also imbalanced. Suppose I am getting high…
0
votes
0 answers

How to set optimal number of trees

I'm working with the Boston Housing data set, making models using trees. It's possible to calculate the optimal number of trees using cross-validation, as the last line shows (in this case 8 trees): library(tree) library(MASS) tree.test.RMSE <- 0 df…
Russ Conte
  • 124
  • 6
0
votes
0 answers

How to handle hyperparameter tuning for LSTM with early stopping?

I am looking for advice on the best practice for determining hyperparameters for my LSTM model. I have time series data that I have divided into train and test sets. I was planning to use an expanding walk-forward cross-validation scheme on my train… (a toy splitting sketch follows this entry)
Merry
  • 215
  • 2
  • 7
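A toy sketch of expanding walk-forward splits, assuming scikit-learn's TimeSeriesSplit and an arbitrary 80/20 split of each training window into a fitting slice and an early-stopping slice; the LSTM fit itself is left as a commented placeholder, so this is not the asker's actual setup.

# Toy expanding walk-forward splits; the 80/20 early-stopping slice is an assumption.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(100, dtype=float).reshape(-1, 1)   # toy time-series features
y = np.arange(100, dtype=float)                  # toy targets

tscv = TimeSeriesSplit(n_splits=5)               # training window expands each fold
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    cut = int(len(train_idx) * 0.8)              # most recent 20% reserved for early stopping
    fit_idx, val_idx = train_idx[:cut], train_idx[cut:]

    # model.fit(X[fit_idx], y[fit_idx],
    #           validation_data=(X[val_idx], y[val_idx]),
    #           callbacks=[EarlyStopping(...)])  # placeholder for the actual LSTM
    print(f"fold {fold}: fit={len(fit_idx)} val={len(val_idx)} test={len(test_idx)}")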