I'm new to Keras and have a question about its training procedure.
Because of a time limit on my server (each job must finish in under 24 hours), I have to train my model in several 10-epoch periods.
In the first period, after 10 epochs, the best model is stored using Keras's ModelCheckpoint callback:
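    from keras.callbacks import ModelCheckpoint

    conf = dict()
    conf['nb_epoch'] = 10
    # Save the full model whenever val_loss improves within this run.
    callbacks = [
        ModelCheckpoint(filepath='/1st_{epoch:d}_{val_loss:.5f}.hdf5',
                        monitor='val_loss', save_best_only=True,
                        save_weights_only=False, verbose=0)
    ]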
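For reference, my understanding is that the filepath template is filled in with Python's `str.format`, using the epoch number and the logged metrics. A minimal sketch of how a name would come out, assuming epoch 10 and a val_loss of exactly 1.0 (the leading slash is dropped here for brevity):

```python
# Same placeholders as the ModelCheckpoint filepath above, without the leading slash.
template = "1st_{epoch:d}_{val_loss:.5f}.hdf5"

# Assumed values for illustration: epoch 10, val_loss 1.0.
name = template.format(epoch=10, val_loss=1.0)
print(name)  # 1st_10_1.00000.hdf5
```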
Assume the best model is saved as '1st_10_1.00000.hdf5'. Next, I continue training for another 10 epochs and store the best model as follows:
    # Resume from the best checkpoint of the first period.
    model.load_weights('1st_10_1.00000.hdf5')
    model.compile(...)
    callbacks = [
        ModelCheckpoint(filepath='/2nd_{epoch:d}_{val_loss:.5f}.hdf5',
                        monitor='val_loss', save_best_only=True,
                        save_weights_only=False, verbose=0)
    ]
But I have a problem. The first epoch of the second training period gives a val_loss of 1.20000, and the script saves a model '2nd_1_1.20000.hdf5'. This val_loss is clearly worse than the best val_loss of the first period (1.00000). The following epochs of the second period then seem to be trained based on the model '2nd_1_1.20000.hdf5', not '1st_10_1.00000.hdf5':
'2nd_1_1.20000.hdf5'
'2nd_2_1.15000.hdf5'
'2nd_3_1.10000.hdf5'
'2nd_4_1.05000.hdf5'
...
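To make the goal concrete, here is a minimal sketch of the bookkeeping I have in mind. It does not use Keras at all: `BestTracker` is a hypothetical stand-in for the "save only if val_loss improved" check, `parse_val_loss` is a helper I made up to read the loss back out of a checkpoint name, and the val_loss values for the second period are invented for illustration. As far as I can tell, a freshly created ModelCheckpoint behaves like the `fresh` tracker below, while I want the behaviour of the `seeded` one.

```python
import math
import os
import re

# Matches basenames like '1st_10_1.00000.hdf5' (tag, epoch, val_loss).
CKPT_RE = re.compile(r"^(\w+?)_(\d+)_(\d+\.\d+)\.hdf5$")


def parse_val_loss(path):
    """Return the val_loss encoded in a checkpoint file name."""
    match = CKPT_RE.match(os.path.basename(path))
    if match is None:
        raise ValueError("unexpected checkpoint name: %r" % path)
    return float(match.group(3))


class BestTracker:
    """Hypothetical stand-in for a 'save only if val_loss improved' check."""

    def __init__(self, initial_best=math.inf):
        self.best = initial_best

    def should_save(self, val_loss):
        # Accept a checkpoint only when it beats the best loss seen so far.
        if val_loss < self.best:
            self.best = val_loss
            return True
        return False


# Best val_loss of the first period, recovered from the file name.
previous_best = parse_val_loss("1st_10_1.00000.hdf5")

# Made-up val_loss values for a second training period.
second_run = [1.20, 1.15, 1.10, 1.05, 0.98]

fresh = BestTracker()                # forgets the first period
seeded = BestTracker(previous_best)  # remembers the first period

fresh_saved = [v for v in second_run if fresh.should_save(v)]
seeded_saved = [v for v in second_run if seeded.should_save(v)]

print(fresh_saved)   # [1.2, 1.15, 1.1, 1.05, 0.98]
print(seeded_saved)  # [0.98]
```

The seeded tracker only accepts a checkpoint once it actually beats the first period's 1.00000. What I am missing is how to pass something like `previous_best` into ModelCheckpoint itself.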
It seems wasteful not to use the better result from the first training period. Can anyone point out how to fix this, or how to tell the program to use the best model from the previous training period? Many thanks in advance!