
I am training a model with the following code

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
early_stopping_monitor = EarlyStopping(patience=3)
model.fit(X_train_np, target, validation_split=0.3, epochs=100, callbacks=[early_stopping_monitor])

This is designed to stop training if `val_loss` does not improve for 3 consecutive epochs. The result is shown below. My question is: will the model stop with the weights of epoch 8 or epoch 7? Performance got worse in epoch 8, so training stopped, but the model ran one epoch past the better-performing parameters of epoch 7. Do I need to retrain the model with 7 epochs?
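For reference, here is a minimal pure-Python sketch (a simplification of Keras's actual logic, ignoring options such as `min_delta` and `baseline`) of how `EarlyStopping(patience=3)` decides when to stop on the `val_loss` sequence from the log below:

```python
def early_stop_epoch(val_losses, patience=3):
    """Return (epoch training stops at, epoch with the best val_loss), 1-based."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` epochs in a row
                return epoch, best_epoch
    return len(val_losses), best_epoch

# val_loss values from the training log below
val_losses = [1.2208, 0.7193, 1.3778, 2.7310, 0.5952, 0.8047, 0.9918, 1.7789]
print(early_stop_epoch(val_losses))  # (8, 5): stops after epoch 8, best was epoch 5
```

Note that by default Keras keeps the weights from the last epoch run (epoch 8 here), not from the best epoch (which is actually epoch 5, not 7, by `val_loss`); the answers below address exactly that.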

Train on 623 samples, validate on 268 samples
Epoch 1/100
623/623 [==============================] - 1s 1ms/step - loss: 4.0365 - accuracy: 0.5923 - val_loss: 1.2208 - val_accuracy: 0.6231
Epoch 2/100
623/623 [==============================] - 0s 114us/step - loss: 1.4412 - accuracy: 0.6356 - val_loss: 0.7193 - val_accuracy: 0.7015
Epoch 3/100
623/623 [==============================] - 0s 103us/step - loss: 1.4335 - accuracy: 0.6260 - val_loss: 1.3778 - val_accuracy: 0.7201
Epoch 4/100
623/623 [==============================] - 0s 106us/step - loss: 3.5732 - accuracy: 0.6324 - val_loss: 2.7310 - val_accuracy: 0.6194
Epoch 5/100
623/623 [==============================] - 0s 111us/step - loss: 1.3116 - accuracy: 0.6372 - val_loss: 0.5952 - val_accuracy: 0.7351
Epoch 6/100
623/623 [==============================] - 0s 98us/step - loss: 0.9357 - accuracy: 0.6645 - val_loss: 0.8047 - val_accuracy: 0.6828
Epoch 7/100
623/623 [==============================] - 0s 105us/step - loss: 0.7671 - accuracy: 0.6934 - val_loss: 0.9918 - val_accuracy: 0.6679
Epoch 8/100
623/623 [==============================] - 0s 126us/step - loss: 2.2968 - accuracy: 0.6629 - val_loss: 1.7789 - val_accuracy: 0.7425
Timbus Calin
jyotiska

2 Answers


Use `restore_best_weights=True` with `monitor` set to the target quantity. The best weights will then be restored automatically at the end of training.

early_stopping_monitor = EarlyStopping(patience=3,
                                       monitor='val_loss',  # assuming it's val_loss
                                       restore_best_weights=True)

From docs:

restore_best_weights: whether to restore model weights from the epoch with the best value of the monitored quantity ('val_loss' here). If False, the model weights obtained at the last step of training are used (default False).

Documentation link
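To see what this option does, here is a hypothetical pure-Python sketch of the bookkeeping behind `restore_best_weights` (the strings stand in for real weight tensors):

```python
# val_loss per epoch, taken from the question's training log
val_losses = [1.2208, 0.7193, 1.3778, 2.7310, 0.5952, 0.8047, 0.9918, 1.7789]

best_loss = float("inf")
best_weights = None   # snapshot kept when restore_best_weights=True
last_weights = None   # what you get with the default restore_best_weights=False
for epoch, loss in enumerate(val_losses, start=1):
    last_weights = f"weights-after-epoch-{epoch}"  # stand-in for real tensors
    if loss < best_loss:
        best_loss = loss
        best_weights = last_weights  # snapshot at the new best epoch

print(best_weights)  # weights-after-epoch-5
print(last_weights)  # weights-after-epoch-8
```

With `restore_best_weights=True`, Keras copies the snapshot back into the model when training ends; with the default `False`, you are left with the last epoch's weights.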

Mikhail Stepanov
  • This does not apply to the main topic, but you may want to use `ModelCheckpoint` to save weights during training. Also, the training curve is quite unstable, so check the learning rate value, etc. – Mikhail Stepanov May 03 '20 at 12:49
  • Thanks, this helps a lot :) I will take your suggestion. What did you observe in the result that made you say the curve is unstable? – jyotiska May 03 '20 at 13:28
  • @jyotiska These values do not decrease epoch by epoch: val loss ...0.71, 1.38, 2.73, 0.59...; and train loss ...1.44, 1.43, 3.57, 1.31.... I don't know whether it is due to the relatively small dataset size or a high learning rate (so the model can't settle in a local minimum). – Mikhail Stepanov May 03 '20 at 19:03

All the code below is for TensorFlow 2.0. The key arguments of the `ModelCheckpoint` callback are:

  • filepath: a string that can include formatting options such as the epoch number. For example, a common pattern is `weights.{epoch:02d}-{val_loss:.2f}.hdf5`.
  • monitor: the quantity to track (typically `'val_loss'` or `'val_accuracy'`).
  • mode: whether the monitored value should be minimized or maximized (typically `'min'` or `'max'`).
  • save_best_only: if set to `True`, the model is saved for the current epoch only if its monitored metric is better than all previous epochs. If set to `False`, the model is saved after every epoch, regardless of whether it improved on previous models.
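As an aside, the `filepath` template is expanded with ordinary Python string formatting, using the epoch number and the metric values logged for that epoch. A quick illustration (the numbers are taken from the question's log):

```python
# Same template as used in the answer's code below
fname = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"

# Keras substitutes the current epoch number and logged metrics;
# e.g. epoch 5 with val_loss 0.5952 from the question's log:
print(fname.format(epoch=5, val_loss=0.5952))  # weights.05-0.60.hdf5
```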

Code

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Save a checkpoint whenever val_loss improves on the best value so far
fname = "weights.{epoch:02d}-{val_loss:.2f}.hdf5"
checkpoint = tf.keras.callbacks.ModelCheckpoint(fname, monitor="val_loss", mode="min", save_best_only=True, verbose=1)
model.fit(X_train_np, target, validation_split=0.3, epochs=100, callbacks=[checkpoint])