14

Because online learning does not work well with Keras when you are using an adaptive optimizer (the learning rate schedule resets when calling .fit()), I want to see if I can simply set it manually. However, in order to do that, I need to find out what the learning rate was at the last epoch.

That said, how can I print the learning rate at each epoch? I think I can do it through a callback, but it seems that you have to recalculate it each time, and I'm not sure how to do that with Adam.

I found this in another thread but it only works with SGD:

from keras.callbacks import Callback
from keras import backend as K

class SGDLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        # Reconstruct the decayed SGD learning rate from the optimizer's variables.
        lr = K.eval(optimizer.lr * (1. / (1. + optimizer.decay * optimizer.iterations)))
        print('\nLR: {:.6f}\n'.format(lr))
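For context, once the last learning rate is known, manually re-applying it before the next .fit() call could look roughly like this (a minimal sketch using the Keras backend; last_lr and the model/data names are placeholders):

from keras import backend as K

# last_lr stands in for the learning rate recovered from the previous run.
K.set_value(model.optimizer.lr, last_lr)
model.fit(x_train, y_train, epochs=1)  # training then resumes with the restored learning rate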
Zach

5 Answers

12

I am using the following approach, which is based on @jorijnsmit's answer:

import tensorflow as tf
from tensorflow import keras

def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer._decayed_lr(tf.float32)  # I use the ._decayed_lr method instead of .lr
    return lr

optimizer = keras.optimizers.Adam()
lr_metric = get_lr_metric(optimizer)

model.compile(
    optimizer=optimizer,
    metrics=['accuracy', lr_metric],
    loss='mean_absolute_error', 
    )

It works with Adam.
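The same idea can also be used from a callback instead of a metric; a minimal sketch (assuming TF 2.x, where _decayed_lr is a private method of the optimizer):

import tensorflow as tf

class DecayedLRLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # _decayed_lr applies any schedule/decay to the base learning rate.
        lr = self.model.optimizer._decayed_lr(tf.float32)
        print('\nEpoch {}: learning rate = {:.6f}'.format(epoch + 1, float(lr)))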

Andrey
6

I found this question very helpful. A minimal workable example that answers your question would be:

from tensorflow import keras  # or plain `import keras`, depending on your setup

def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer.lr
    return lr

optimizer = keras.optimizers.Adam()
lr_metric = get_lr_metric(optimizer)

model.compile(
    optimizer=optimizer,
    metrics=['accuracy', lr_metric],
    loss='mean_absolute_error', 
    )
gosuto
  • This answer, which seems copied from the one [before](https://stackoverflow.com/a/65058380/4886772), does not work because the returned lr must be a tensor. – Luca Foppiano Jun 15 '23 at 01:29
  • @LucaFoppiano before? my answer dates from 2019, versus 2020 for the one you link. – gosuto Jun 24 '23 at 04:29
  • You're right. I did not see the date. Sorry about that. But perhaps you could add the version of tensorflow you were using at the time. – Luca Foppiano Jun 25 '23 at 22:59
5

For everyone who is still confused on this topic:

The solution from @Andrey works, but only if you set a decay on your learning rate, i.e. you schedule the learning rate to lower itself after n epochs. Otherwise it will always print the same number (the starting learning rate), because that number DOES NOT change during training. You can't see how the learning rate adapts, because every parameter in Adam has its own learning rate that adapts itself during training, while the variable lr NEVER changes.

from keras.callbacks import Callback
from keras import backend as K

class MyCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        # If you want to apply decay:
        decay = self.model.optimizer.decay
        iterations = self.model.optimizer.iterations
        lr_with_decay = lr / (1. + decay * K.cast(iterations, K.dtype(decay)))
        print(K.eval(lr_with_decay))

Follow this thread.
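For reference, a callback like this is attached through the callbacks argument of .fit(); a minimal usage sketch (the model and data names are placeholders):

model.fit(x_train, y_train, epochs=10, callbacks=[MyCallback()])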

Tushar Gupta
1

This piece of code might help you. It is based on the Keras implementation of the Adam optimizer (the beta values are the Keras defaults):

from keras.callbacks import Callback
from keras import backend as K

class AdamLearningRateTracker(Callback):
    def on_epoch_end(self, epoch, logs=None):
        beta_1, beta_2 = 0.9, 0.999  # Keras defaults for Adam
        optimizer = self.model.optimizer
        lr = optimizer.lr
        if K.eval(optimizer.decay) > 0:  # decay is a tensor, so evaluate it before comparing
            lr = lr * (1. / (1. + optimizer.decay * K.cast(optimizer.iterations, K.floatx())))
        # Bias-corrected step size actually used by Adam at iteration t.
        t = K.cast(optimizer.iterations, K.floatx()) + 1
        lr_t = lr * (K.sqrt(1. - K.pow(beta_2, t)) / (1. - K.pow(beta_1, t)))
        print('\nLR: {:.6f}\n'.format(K.eval(lr_t)))
  • Your snippet is a good baseline, but it has multiple errors (at least for Keras 2.2.4 with Tensorflow 1.13.1): 1. The function signature should be `def on_epoch_end(self, epoch, logs=None):` 2. `optimizer.decay` is a tensor and cannot be evaluated as bool, needs a `K.eval` 3. `lr` is not defined if `decay` is 0 4. `lr_t` is a tensor and cannot be used in a format string, needs a `K.eval` as well – Arno Hilke May 28 '19 at 13:46