
First, I split the dataset into train and test, for example:

    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.4, random_state=999)
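For completeness, that line assumes the built-in iris dataset and the usual scikit-learn imports, roughly:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    iris = load_iris()   # 150 samples, 4 features, 3 classes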

I then use GridSearchCV with cross-validation to find the best performing model:

    validator = GridSearchCV(estimator=clf, param_grid=param_grid, scoring="accuracy", cv=cv)
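Here clf, param_grid and cv are not shown; one way they might be defined for a Keras model is via the (old-style) keras.wrappers.scikit_learn.KerasClassifier wrapper. The network and grid below are placeholders, just to make the example concrete:

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier

    def build_model(units=8):
        # hypothetical small network for the 4-feature, 3-class iris data
        model = Sequential([Dense(units, activation="relu", input_shape=(4,)),
                            Dense(3, activation="softmax")])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    clf = KerasClassifier(build_fn=build_model, epochs=20, batch_size=16, verbose=0)
    param_grid = {"units": [4, 8, 16]}   # hypothetical hyperparameter grid
    cv = 5                               # 5-fold cross-validation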

And by doing this, I have:

A model is trained using k-1 of the folds as training data; the resulting model is validated on the remaining part of the data (scikit-learn.org)

But then, when reading about the Keras fit function, the documentation introduces two more terms:

validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.

validation_data: tuple (x_val, y_val) or tuple (x_val, y_val, val_sample_weights) on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split.
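As an illustration of those two arguments in a plain Keras fit() call (the toy model and random data below are placeholders, not part of the question):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    # toy data just for illustration: 100 samples, 4 features, 3 classes
    X = np.random.rand(100, 4)
    y = np.random.randint(0, 3, size=100)

    model = Sequential([Dense(8, activation="relu", input_shape=(4,)),
                        Dense(3, activation="softmax")])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # validation_split: the last 10% of X/y is set aside and evaluated after every epoch
    model.fit(X, y, epochs=5, validation_split=0.1)

    # validation_data: an explicit, fixed validation set; this overrides validation_split
    X_val, y_val = X[:20], y[:20]
    model.fit(X, y, epochs=5, validation_data=(X_val, y_val))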

From what I understand, validation_split (which is overridden by validation_data) is used as a fixed validation dataset, whereas the hold-out set in cross-validation changes at each cross-validation step.

  • First question: is it necessary to use validation_split or validation_data since I already do cross validation?
  • Second question: if it is not necessary, then should I set validation_split and validation_data to 0 and None, respectively?

    grid_result = validator.fit(train_images, train_labels, validation_data=None, validation_split=0)
    
  • Question 3: If I do so, what will happen during training? Would Keras just simply ignore the validation step?

  • Question 4: Does the validation_split belong to the k-1 folds or the hold-out fold, or will it be considered a "test set" (like in the case of cross-validation) which will never be used to train the model?


1 Answer


Validation is performed to ensure that the model is not overfitting on the dataset and that it will generalize to new data. Since the grid search already performs validation, there is no need for the Keras model itself to perform a validation step during training. Therefore, to answer your questions:

is it necessary to use validation_split or validation_data since I already do cross validation?

No, as I mentioned above.

if it is not necessary, then should I set validation_split and validation_data to 0 and None, respectively?

No, since by default no validation is done in Keras (i.e. by default we have validation_split=0.0 and validation_data=None in the fit() method).

If I do so, what will happen during the training, would Keras just simply ignore the validation step?

Yes, Keras won't perform any validation when training the model. However, note that, as I mentioned above, the grid search procedure would still perform validation to better estimate the performance of the model with a specific set of parameters.
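In other words, the fit call from the question can simply drop the validation arguments (a sketch, assuming the validator, train_images and train_labels from the question):

    # GridSearchCV evaluates each parameter combination on its own held-out fold;
    # Keras itself performs no per-epoch validation here.
    grid_result = validator.fit(train_images, train_labels)

    print(grid_result.best_params_)   # best hyperparameter combination found
    print(grid_result.best_score_)    # its mean cross-validated accuracy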

  • Silly me, I was not aware of the default setting in the Keras fit() function. I know it sounds odd, but could you please explain what would happen if I set validation_split=0.1 and also set cv=5? Actually I did, and it seems like Keras tries to validate on the validation_split as well. But in that case, what happens under the hood? Does the validation_split belong to the k-1 folds or the hold-out fold, or will it be considered a "test set" (like in the case of cross-validation) which will never be used to train the model? – Long Nov 07 '18 at 14:15
  • @Long To better estimate the performance of the model with a specific set of hyperparameters, grid search would do k-fold cross-validation: it would split the data into k chunks (i.e. folds), give k-1 of the chunks to the Keras model for training, and put aside one of the chunks for evaluation of the trained model. Now, if you also tell the Keras model to perform validation, it would take a portion of those k-1 chunks and use it for validation. But that's unnecessary, because it would not affect the training procedure, and it may also harm the validation done by the grid search. – today Nov 07 '18 at 14:44
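To make this comment's point concrete, a rough worked example, assuming 120 training samples, cv=5 and validation_split=0.1 (the numbers are approximate):

    n_total      = 120                       # samples passed to validator.fit()
    n_cv_train   = n_total * 4 // 5          # 96: the k-1 folds handed to Keras in each CV round
    n_keras_val  = round(n_cv_train * 0.1)   # ~10: carved out of those 96 by validation_split
    n_keras_fit  = n_cv_train - n_keras_val  # ~86: what the weights are actually fitted on
    n_cv_holdout = n_total // 5              # 24: the fold GridSearchCV scores, never trained on

So validation_split comes out of the k-1 training folds, not the hold-out fold; the hold-out fold already plays the role of the "test set" in each cross-validation round.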