I have a training set on which I would like to train a neural network, using K-fold cross-validation.
TL;DR: Given the number of epochs, the set of params to be searched, and evaluation on the held-out test fold, how does RandomizedSearchCV train the model? I would think that, for each combination of params, it trains the model on (K-1) folds for epochs epochs and then scores it on the remaining fold (see the sketch right after this TL;DR). But then, what prevents us from overfitting? In "vanilla" training with a fixed validation set, keras evaluates on the validation set after every epoch; does that happen here as well? Even with verbose=1 I don't see the scores from the fit on the remaining fold. I saw here that we can add callbacks to the KerasClassifier, but then what happens if the settings of KerasClassifier and RandomizedSearchCV clash? Can I add a callback there to monitor val_prc, for example? If so, what would happen?
Sorry for the long TL;DR!
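To make the question concrete, here is the loop I imagine RandomizedSearchCV running under the hood. This is only my mental model written out as a simplified sketch with plain scikit-learn utilities, not the actual implementation:

import numpy as np
from sklearn.base import clone
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterSampler, StratifiedKFold

def randomized_search_sketch(estimator, X, y, param_distributions,
                             n_iter=10, n_splits=5, random_state=None):
    """My mental model of RandomizedSearchCV, written out (not the real code)."""
    cv = StratifiedKFold(n_splits=n_splits)
    results = []
    # 1) sample n_iter candidate hyper-parameter combinations
    for candidate in ParameterSampler(param_distributions, n_iter=n_iter,
                                      random_state=random_state):
        fold_scores = []
        # 2) for each candidate: fit a fresh clone on (K-1) folds for the full
        #    number of epochs, then score it once on the held-out fold
        for train_idx, test_idx in cv.split(X, y):
            est = clone(estimator).set_params(**candidate)
            est.fit(X[train_idx], y[train_idx])
            y_score = est.predict_proba(X[test_idx])[:, 1]
            fold_scores.append(roc_auc_score(y[test_idx], y_score))
        results.append((candidate, np.mean(fold_scores)))
    # 3) report the candidate with the best mean held-out score
    return max(results, key=lambda r: r[1])

If this sketch is right, each candidate model only sees the held-out fold once, after training has finished, which is exactly why I wonder where per-epoch validation would fit in.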
Regarding the training procedure, I am using the keras-sklearn interface. I defined the model with
model = KerasClassifier(build_fn=get_model_, epochs=120, batch_size=32, verbose=1)
where get_model_ is a function that returns a compiled tf.keras model.
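For reference, get_model_ is along the lines of the simplified placeholder below (not my actual network); the point is only that it accepts the searched hyperparameters as keyword arguments and compiles a PR-AUC metric named prc, which is what val_prc would refer to:

from tensorflow import keras
from tensorflow.keras import layers, regularizers

def get_model_(l2=0.01, dropout_rate=0.3, learning_rate=0.001):
    # Simplified placeholder: a small binary classifier whose regularization,
    # dropout and learning rate are exposed so the wrapper can set them.
    model = keras.Sequential([
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout_rate),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(curve="PR", name="prc")])
    return model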
Given the model, the training procedure is the following:
params = dict({'l2': [0.1, 0.3, 0.5, 0.8],
               'dropout_rate': [0.1, 0.3, 0.5, 0.8],
               'batch_size': [16, 32, 64, 128],
               'learning_rate': [0.001, 0.01, 0.05, 0.1]})
def trainer(model, X, y, folds, params, verbose=None):
    from keras.wrappers.scikit_learn import KerasClassifier
    from tensorflow.keras.optimizers import Adam
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

    if not verbose:
        v = 0
    else:
        v = verbose

    # randomized search over the param distributions, scored by ROC AUC
    clf = RandomizedSearchCV(model,
                             param_distributions=params,
                             n_jobs=1,
                             scoring="roc_auc",
                             cv=folds,
                             verbose=v)

    # -------------- fit ------------
    grid_result = clf.fit(X, y)

    # summarize results
    print('- ' * 40)
    print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
    print('- ' * 40)
# ------ Training -------- #
trainer(model, X_train, y_train, folds, params, verbose=1)
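To illustrate the callback part of the TL;DR, the kind of thing I had in mind is below. I am assuming here that the wrapper forwards fit-arguments such as callbacks and validation_split to model.fit, and that prc is a compiled metric; what I do not know is how this interacts with the folds that RandomizedSearchCV already holds out:

from keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.callbacks import EarlyStopping

# Assumption: KerasClassifier forwards fit-arguments like `callbacks` and
# `validation_split` to model.fit; `val_prc` only exists if the model
# compiles a metric named "prc" and some validation data is provided.
early_stop = EarlyStopping(monitor="val_prc", mode="max",
                           patience=10, restore_best_weights=True)

model = KerasClassifier(build_fn=get_model_,
                        epochs=120,
                        batch_size=32,
                        verbose=1,
                        callbacks=[early_stop],
                        validation_split=0.1)  # carved out of the (K-1) training folds

Would the validation_split here clash with the splits that RandomizedSearchCV makes?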
First, am I using RandomizedSearchCV correctly? Regardless of the number of options for each param, I get the same message: Fitting 5 folds for each of 10 candidates, totalling 50 fits
Second, I am dealing with a hard problem: imbalanced data combined with very little data. Even taking that into account, I get unexpectedly low scores and high loss values.
Lastly, following up on the TL;DR: assuming the code above is correct, what training procedure is actually being carried out?
Thanks!