I found that restore_best_weights=True in Keras's EarlyStopping callback does not actually restore the model to its best state. A simplified example with some dummy data:
import numpy as np
from tensorflow.keras.utils import set_random_seed
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import EarlyStopping

# Make the run reproducible
np.random.seed(1)
set_random_seed(2)

# Dummy data
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 4., 2., 5.])

# Small model
model = Sequential()
model.add(Dense(2, input_shape=(1,), activation='tanh'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1))
model.compile(optimizer=RMSprop(learning_rate=0.1), loss='mse')

# Stop once the training loss has not improved for 2 epochs,
# and restore the weights from the best epoch
stopmon = EarlyStopping(monitor='loss', patience=2, restore_best_weights=True, verbose=1)

history = model.fit(x, y, epochs=100, verbose=2, callbacks=[stopmon])
res = model.evaluate(x, y, verbose=1)
print(f'best={stopmon.best:.4f}, loss={res:.4f}')
The output (on my system) is:
Epoch 1/100
1/1 - 0s - loss: 11.8290 - 434ms/epoch - 434ms/step
Epoch 2/100
1/1 - 0s - loss: 1.9091 - 0s/epoch - 0s/step
Epoch 3/100
1/1 - 0s - loss: 1.5159 - 16ms/epoch - 16ms/step
Epoch 4/100
1/1 - 0s - loss: 1.3921 - 0s/epoch - 0s/step
Epoch 5/100
1/1 - 0s - loss: 1.6787 - 0s/epoch - 0s/step
Epoch 6/100
Restoring model weights from the end of the best epoch: 4.
1/1 - 0s - loss: 2.0629 - 33ms/epoch - 33ms/step
Epoch 6: early stopping
1/1 [==============================] - 0s 100ms/step - loss: 1.6787
best=1.3921, loss=1.6787
It looks like the weights are set to those from epoch 4. Then why does the loss evaluate to 1.6787 (the value reported for epoch 5) rather than the best value of 1.3921 from epoch 4? Is there anything extra I should do to update the model?
I use an up-to-date TensorFlow (version 2.12.0) on Windows x64 (Intel), with tf.version.COMPILER_VERSION == 'MSVC 192930140'.