Questions tagged [cross-validation]

Cross-Validation is a method of evaluating and comparing predictive systems in statistics and machine learning.

Cross-Validation is a statistical method of evaluating and comparing learning algorithms by dividing data into two segments: one used to learn or train a model and the other used to validate the model.

In typical cross-validation, the training and validation sets cross over in successive rounds so that each data point gets a chance to appear in the validation set. The basic form of cross-validation is k-fold cross-validation.

Other forms of cross-validation are special cases of k-fold cross-validation or involve repeated rounds of k-fold cross-validation.
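For concreteness, a minimal sketch of 5-fold cross-validation with scikit-learn (the library most questions under this tag use); the iris data and logistic regression model are placeholders:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Each of the 5 rounds trains on 4 folds and validates on the held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean(), scores.std())
```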

2604 questions
14 votes • 3 answers

Classification report with Nested Cross Validation in SKlearn (Average/Individual values)

Is it possible to get a classification report from cross_val_score through some workaround? I'm using nested cross-validation, and while I can get various scores for a model this way, I would like to see the classification report of the outer loop. Any…
utengr • 3,225
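One common workaround (a sketch, not the only approach): run the outer loop with cross_val_predict so the pooled out-of-fold predictions can feed classification_report. Note this yields a single aggregate report rather than per-fold reports; the dataset and SVC grid below are placeholders:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, cross_val_predict
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3)  # inner loop: tuning
y_pred = cross_val_predict(inner, X, y, cv=5)            # outer loop: evaluation
print(classification_report(y, y_pred))
```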
14 votes • 3 answers

How to plot a learning curve for a keras experiment?

I'm training an RNN using keras and would like to see how the validation accuracy changes with the data set size. Keras has a list called val_acc in its history object, to which the respective validation set accuracy is appended after every epoch…
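A hedged sketch of plotting per-epoch validation accuracy from the History object; the toy dense network stands in for the question's RNN, and the history key is "val_acc" or "val_accuracy" depending on the Keras version. A curve over dataset size would instead require retraining on nested subsets:

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

# Toy data and model stand in for the question's RNN setup.
x = np.random.rand(200, 10)
y = (x.sum(axis=1) > 5).astype(int)

model = keras.Sequential([
    keras.layers.Dense(8, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

history = model.fit(x, y, validation_split=0.2, epochs=20, verbose=0)

# Older Keras stores "val_acc"; newer versions store "val_accuracy".
val_key = "val_acc" if "val_acc" in history.history else "val_accuracy"
plt.plot(history.history[val_key], label="validation accuracy")
plt.plot(history.history[val_key.replace("val_", "")], label="training accuracy")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```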
14 votes • 2 answers

Difference between using train_test_split and cross_val_score in sklearn.cross_validation

I have a matrix with 20 columns. The last column contains 0/1 labels. The link to the data is here. I am trying to run a random forest on the dataset using cross-validation. I use two methods of doing this: using…
evianpring • 3,316
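A sketch contrasting the two approaches under the question's setup (20 columns, 0/1 labels in the last); the synthetic matrix is a placeholder, and the modern module is sklearn.model_selection rather than the deprecated sklearn.cross_validation:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.RandomState(0)
data = rng.rand(500, 20)
X, y = data[:, :19], (data[:, 19] > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# (1) Single hold-out split: one estimate from one random partition.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
print(accuracy_score(y_te, clf.fit(X_tr, y_tr).predict(X_te)))

# (2) Cross-validation: k estimates, each point validated exactly once.
print(cross_val_score(clf, X, y, cv=5).mean())
```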
14 votes • 2 answers

Cross validation for glm() models

I'm trying to do 10-fold cross-validation for some glm models that I built earlier in R. I'm a little confused about the cv.glm() function in the boot package, although I've read a lot of help files. When I provide the following…
Error404 • 6,959
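The question concerns R's boot::cv.glm; to keep this page's examples in one language, here is merely the analogous 10-fold CV for a logistic GLM in scikit-learn, not the cv.glm API itself (dataset is a placeholder):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
glm = LogisticRegression(max_iter=5000)  # logistic GLM analogue

# cv=10 mirrors cv.glm(..., K = 10); the mean error plays the role of delta.
scores = cross_val_score(glm, X, y, cv=10, scoring="accuracy")
print(1 - scores.mean())  # cross-validated error estimate
```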
13 votes • 1 answer

key error not in index while cross validation

I have applied SVM to my dataset. My dataset is multi-label, meaning each observation has more than one label. During KFold cross-validation it raises a "not in index" error. It shows the indices from 601 to 6007 as not in the index (I have 1...6008 data…
sariii • 2,020
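The likely cause (an assumption from the error message): KFold yields positional indices, while plain bracket indexing on a pandas object looks up labels. A sketch of the usual .iloc fix with placeholder data:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

X = pd.DataFrame(np.random.rand(100, 4))
y = pd.Series(np.random.randint(0, 2, 100))

for train_idx, test_idx in KFold(n_splits=5).split(X):
    # .iloc selects by position; X[train_idx] would look up column labels
    # and raise "not in index" once positions and labels diverge.
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
```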
13 votes • 1 answer

Do I use the same Tfidf vocabulary in k-fold cross_validation

I am doing text classification based on the TF-IDF vector space model. I have no more than 3000 samples. For a fair evaluation, I'm evaluating the classifier using 5-fold cross-validation. But what confuses me is whether it is necessary to…
lx.F • 131
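A hedged sketch of the leakage-safe answer: put the vectorizer in a Pipeline so the TF-IDF vocabulary is refit on the training folds of every round rather than shared across folds; the texts and the LinearSVC classifier are placeholders:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["good movie", "bad movie", "great film", "terrible film"] * 50
labels = [1, 0, 1, 0] * 50

pipe = make_pipeline(TfidfVectorizer(), LinearSVC())
# The vectorizer (and its vocabulary) is refit on the 4 training folds
# in every round; the held-out fold never shapes the vocabulary.
print(cross_val_score(pipe, texts, labels, cv=5).mean())
```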
13 votes • 3 answers

Scikit-learn, GroupKFold with shuffling groups?

I was using StratifiedKFold from scikit-learn, but now I also need to account for "groups". There is a nice function GroupKFold, but my data are very time-dependent. Similarly to the help examples, the week number is the grouping index. But each week should…
gugatr0n1c • 377
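GroupKFold has no shuffle option in older scikit-learn releases; one workaround (a hypothetical helper, not an official API) is to assign whole groups to folds in a random order yourself:

```python
import numpy as np

def shuffled_group_kfold(groups, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) with intact groups in shuffled fold order."""
    rng = np.random.RandomState(seed)
    uniq = np.unique(groups)
    rng.shuffle(uniq)                                   # shuffle group order
    fold_of_group = {g: i % n_splits for i, g in enumerate(uniq)}
    folds = np.array([fold_of_group[g] for g in groups])
    for k in range(n_splits):
        yield np.where(folds != k)[0], np.where(folds == k)[0]

groups = np.repeat(np.arange(10), 20)   # e.g. 10 weeks, 20 rows each
for train_idx, test_idx in shuffled_group_kfold(groups):
    pass  # fit and evaluate per fold here
```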
13 votes • 1 answer

How to nest LabelKFold?

I have a dataset with ~300 points and 32 distinct labels and I want to evaluate a LinearSVR model by plotting its learning curve using grid search and LabelKFold validation. The code I have looks like this: import numpy as np from sklearn import…
Alex • 759
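In modern scikit-learn, LabelKFold was renamed GroupKFold. A minimal sketch, assuming a LinearSVR grid search where groups= is forwarded through fit() to the splitter; the data and grid are placeholders:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, GroupKFold
from sklearn.svm import LinearSVR

X = np.random.rand(300, 5)
y = np.random.rand(300)
groups = np.random.randint(0, 32, 300)   # ~32 distinct labels, as in the question

grid = GridSearchCV(LinearSVR(max_iter=10000),
                    {"C": [0.1, 1.0, 10.0]},
                    cv=GroupKFold(n_splits=5))
grid.fit(X, y, groups=groups)             # groups reach the GroupKFold splitter
print(grid.best_params_)
```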
13 votes • 3 answers

How to Plot PR-Curve Over 10 folds of Cross Validation in Scikit-Learn

I'm running some supervised experiments for a binary prediction problem. I'm using 10-fold cross-validation to evaluate performance in terms of mean average precision (the average precision of each fold, summed and divided by the number of folds for cross…
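A hedged sketch of one way to do it: compute a precision-recall curve per fold, overlay the ten curves, and average the per-fold average precision; data and model are placeholders:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=500, random_state=0)
aps = []
for train_idx, test_idx in StratifiedKFold(n_splits=10).split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    probs = clf.predict_proba(X[test_idx])[:, 1]
    prec, rec, _ = precision_recall_curve(y[test_idx], probs)
    plt.plot(rec, prec, alpha=0.3)        # one curve per fold
    aps.append(average_precision_score(y[test_idx], probs))

plt.xlabel("recall")
plt.ylabel("precision")
plt.title(f"mean AP over 10 folds: {np.mean(aps):.3f}")
plt.show()
```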
12 votes • 2 answers

Alternate different models in Pipeline for GridSearchCV

I want to build a Pipeline in sklearn and test different models using GridSearchCV. Just an example (please do not pay attention to which particular models are chosen): reg = LogisticRegression() proj1 = PCA(n_components=2) proj2 = MDS() proj3 =…
sooobus • 841
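One known pattern (a sketch, with PCA variants standing in for the question's MDS, which has no transform for unseen data): pipeline steps are themselves parameters, so a list of param-grid dicts can swap whole estimators:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
pipe = Pipeline([("proj", PCA()), ("clf", LogisticRegression(max_iter=1000))])

# Each dict is its own sub-grid; the "proj" and "clf" entries replace
# whole pipeline steps, not just their hyperparameters.
param_grid = [
    {"proj": [PCA(n_components=2), PCA(n_components=3)],
     "clf": [LogisticRegression(max_iter=1000)]},
    {"proj": [PCA(n_components=2)],
     "clf": [SVC()], "clf__C": [0.1, 1, 10]},
]
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(search.best_params_)
```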
12 votes • 1 answer

Validation and Testing accuracy widely different

I am currently working on a dataset on Kaggle. After training the model on the training data, I tested it on the validation data and got an accuracy of around 0.49. However, the same model gives an accuracy of 0.05 on the testing data. I am using…
12 votes • 2 answers

Scikit-Learn: Avoiding Data Leakage During Cross-Validation

I've just been reading up on k-fold cross-validation and have realized that I'm inadvertently leaking data with my current preprocessing setup. Usually, I have a train and test dataset. I do a bunch of data imputation and one-hot encoding on my…
anon_swe • 8,791
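The standard fix, sketched with placeholder data: move imputation and scaling inside a Pipeline so they are refit on the training folds only and never see the held-out fold:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.rand(300, 5)
X[rng.rand(300, 5) < 0.1] = np.nan          # missing values to impute
y = rng.randint(0, 2, 300)

# Imputer and scaler are fit on the training folds of each round only,
# so no statistics from the validation fold leak into preprocessing.
pipe = make_pipeline(SimpleImputer(strategy="median"),
                     StandardScaler(),
                     LogisticRegression())
print(cross_val_score(pipe, X, y, cv=5).mean())
```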
12 votes • 1 answer

K fold cross validation using keras

It seems that k-fold cross-validation for conv nets is not taken seriously due to the huge running time of the neural network. I have a small dataset and I am interested in doing k-fold cross-validation using the example given here. Is it possible?…
motiur • 1,640
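It is possible with a manual loop; a hedged sketch that rebuilds the network each fold so no weights carry over between rounds, with toy data standing in for the question's set:

```python
import numpy as np
from sklearn.model_selection import KFold
from tensorflow import keras

x = np.random.rand(200, 10)
y = np.random.randint(0, 2, 200)

def build_model():
    m = keras.Sequential([keras.layers.Dense(16, activation="relu"),
                          keras.layers.Dense(1, activation="sigmoid")])
    m.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
    return m

scores = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(x):
    model = build_model()                 # fresh weights every fold
    model.fit(x[train_idx], y[train_idx], epochs=10, verbose=0)
    scores.append(model.evaluate(x[test_idx], y[test_idx], verbose=0)[1])
print(np.mean(scores))
```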
12 votes • 1 answer

Evaluating Logistic regression with cross validation

I would like to use cross-validation to train/test my dataset and evaluate the performance of the logistic regression model on the entire dataset, not only on the test set (e.g. 25%). These concepts are totally new to me and I am not very sure if…
S.H • 137
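A hedged sketch using cross_val_predict, which yields one out-of-fold prediction per row so metrics cover the entire dataset rather than a single 25% hold-out; the dataset is a placeholder:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_predict

X, y = load_breast_cancer(return_X_y=True)
y_pred = cross_val_predict(LogisticRegression(max_iter=5000), X, y, cv=4)

print(accuracy_score(y, y_pred))   # every sample scored exactly once
print(confusion_matrix(y, y_pred))
```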
12 votes • 1 answer

Spark K-fold Cross Validation

I’m having some trouble understanding Spark’s cross-validation. Every example I have seen uses it for parameter tuning, but I assumed that it would just do regular k-fold cross-validation as well? What I want to do is to perform k-fold cross…
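A hedged sketch of the usual trick: Spark's CrossValidator is built for tuning, but an empty parameter grid reduces it to plain k-fold evaluation of a single model; the toy DataFrame is a placeholder:

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.linalg import Vectors
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(Vectors.dense([float(i), float(i % 3)]), float(i % 2))
     for i in range(100)],
    ["features", "label"])

# An empty parameter grid means one candidate (the model as configured),
# so CrossValidator just performs 5-fold evaluation of that single model.
cv = CrossValidator(estimator=LogisticRegression(),
                    estimatorParamMaps=ParamGridBuilder().build(),
                    evaluator=BinaryClassificationEvaluator(),
                    numFolds=5)
model = cv.fit(df)
print(model.avgMetrics)   # averaged metric across the 5 folds
```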