
I am trying to write a custom learning rate scheduler for SGD in Keras that changes the learning rate based on the iteration. However, the LearningRateScheduler callback only accepts a function that takes the epoch. My learning rate function looks like this:

Learning rate = base_learning_rate x (1 + gamma x iteration)^(-power)
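For context, the built-in callback only passes the epoch index (and optionally the current learning rate) to the schedule function, so there is no way to plug in the iteration count:

from tensorflow import keras

def schedule(epoch, lr):
    # only the epoch number and current lr are available here,
    # not the batch/iteration count that my formula needs
    return lr

lr_callback = keras.callbacks.LearningRateScheduler(schedule)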

Kaveh

2 Answers


This can be achieved by defining your own tf.keras.optimizers.schedules.LearningRateSchedule and passing it into the optimizer.

import tensorflow as tf

class Example(tf.keras.optimizers.schedules.LearningRateSchedule):

  def __init__(self, initial_learning_rate, gamma, power):
    super().__init__()
    self.initial_learning_rate = initial_learning_rate
    self.gamma = gamma
    self.power = power

  def __call__(self, step):
    # step is the optimizer's iteration counter (an integer tensor),
    # so cast it to float before combining it with the float hyperparameters
    step = tf.cast(step, tf.float32)
    return self.initial_learning_rate * tf.pow(step * self.gamma + 1, -self.power)

optimizer = tf.keras.optimizers.SGD(learning_rate=Example(0.1, 0.001, 2))
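To verify the schedule returns the values you expect, you can call it directly with a few step values before training (untested sketch):

schedule = Example(0.1, 0.001, 2)
for step in [0, 100, 1000]:
    # should match initial_learning_rate * (1 + gamma * step) ** (-power)
    print(step, float(schedule(step)))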

Reference: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/schedules/LearningRateSchedule

Laplace Ricky
  • I have already tried that idea. However, the loss value increased dramatically. The Keras documentation suggests using a callback instead of passing a custom schedule. FYI, I tried to reproduce [this paper](https://hal.inria.fr/hal-01799574/document) and here is [my custom schedule](https://codeshare.io/zydxwj). – Casey Werner Jun 12 '21 at 11:57
  • As long as you have made sure that your implementation of LearningRateSchedule returns the expected values when called in graph mode, I think the exploding gradient problem comes from other causes, such as the initial learning rate being set too high. But that may be off topic for your current question. It would be better to raise a new question so that you can describe the issue in more detail, and more people can help. – Laplace Ricky Jun 12 '21 at 13:41
  • OK, I will create a new question. There must be something wrong with my scheduler or optimizer. Thank you! – Casey Werner Jun 12 '21 at 14:18

When you say "change the learning rate based on iteration", do you mean you want to change it at the end of each batch? If so, you can do that with a custom callback. I have not tested this, but the code would be something like:

import tensorflow as tf
from tensorflow import keras

class LRA(keras.callbacks.Callback):
    def __init__(self, model, initial_learning_rate, gamma, power):
        super(LRA, self).__init__()
        self.initial_learning_rate = initial_learning_rate
        self.gamma = gamma
        self.power = power
        self.model = model  # model is your compiled model

    def on_train_begin(self, logs=None):
        tf.keras.backend.set_value(self.model.optimizer.lr,
                                   self.initial_learning_rate)

    def on_train_batch_end(self, batch, logs=None):
        # note: batch is the index within the current epoch, so it restarts at 0 every epoch
        lr = self.initial_learning_rate * ((batch + 1) * self.gamma + 1) ** (-self.power)
        tf.keras.backend.set_value(self.model.optimizer.lr, lr)
        # print('for ', batch, ' lr set to ', lr)  # remove comment if you want to see lr change

Let me know if this works; again, I have not tested it yet.

Before you run model.fit, include code like this:

initial_learning_rate = .001  # set to desired value
gamma =  # set to desired value
power =  # set to desired value
callbacks = [LRA(model=model, initial_learning_rate=initial_learning_rate, gamma=gamma, power=power)]
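Then pass the callback list to model.fit, for example (sketch; train_ds just stands in for whatever training data you use):

history = model.fit(train_ds, epochs=10, callbacks=callbacks)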
   
        
   
Gerry P
  • Yes, I do mean "change it at the end of each batch". I'll try your solution now. I also explain my problem in detail [here](https://stackoverflow.com/questions/67949878/exploding-loss-value-on-training). – Casey Werner Jun 13 '21 at 00:55