
In deep reinforcement learning, is there any way to decay the learning rate with respect to the cumulative reward? That is, to decay the learning rate once the agent starts to learn and maximize the reward?

1 Answer


It is common to decay the learning rate as a function of the number of training steps, so it would certainly be possible to decay it as a function of cumulative reward instead.
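
As a rough illustration, here is a minimal sketch of what that could look like. The function name `reward_based_lr`, the linear decay shape, and all the parameter values are my own illustrative choices, not a standard API:

```python
def reward_based_lr(avg_reward, base_lr=1e-3, min_lr=1e-5, target_reward=80.0):
    """Linearly anneal the learning rate as the running average
    episode reward approaches a chosen target.

    The linear form and parameter names here are illustrative;
    many other decay shapes would work as well.
    """
    # Fraction of the target reward attained so far, clipped to [0, 1].
    progress = max(0.0, min(avg_reward / target_reward, 1.0))
    # Interpolate from base_lr (no progress) down to min_lr (target reached).
    return base_lr - progress * (base_lr - min_lr)
```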

One risk is that you do not know at the start of training what reward is actually achievable, so reducing the learning rate too early is a common problem. If you target a reward of 80, with the learning rate declining sharply as you approach that value, you will never know whether your algorithm could have attained 90, because learning effectively stops at 80.

Another problem is setting the target too high. If you set the target at 100, so that the learning rate has barely decayed by the time you reach 85, the remaining instability may mean the algorithm cannot converge well enough to reach 90. Both failure modes are demonstrated below.
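
Using the hypothetical `reward_based_lr` sketch above, both failure modes are easy to see numerically:

```python
# Target set at 80: the learning rate bottoms out the moment the agent
# averages 80, so learning effectively stops even if 90 was attainable.
print(reward_based_lr(80.0, target_reward=80.0))   # -> 1e-05 (min_lr)

# Target set at 100: at an average reward of 85 the learning rate is
# still fairly high, which may leave updates too noisy to converge.
print(reward_based_lr(85.0, target_reward=100.0))  # -> ~1.6e-04
```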

So in general, I think people try a variety of learning rate schedules and, where possible, let the algorithms run for plenty of time to see whether they converge.
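
For comparison, the more common step-based schedules can be built with off-the-shelf tools, for example PyTorch's `LambdaLR`. The halving factor and the 10,000-step horizon below are arbitrary illustrative values, and the parameter is a dummy just to construct the optimizer:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

# A dummy parameter, only so the optimizer can be constructed.
params = [torch.zeros(1, requires_grad=True)]
optimizer = torch.optim.Adam(params, lr=1e-3)

# Standard step-based decay: halve the learning rate every 10,000 steps.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: 0.5 ** (step / 10_000))

for step in range(100_000):
    # ... compute loss, loss.backward(), optimizer.step() ...
    scheduler.step()  # advance the schedule once per training step
```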

dilaudid
  • Thank you for the answer. How about reducing the learning rate to 0 once I reach the 'maximum' reward? – M. Awais Jadoon Jul 01 '20 at 16:00
  • In RL it is common to reach maximum reward through pure chance (in some games the reward is binary, win or lose, e.g. blackjack). In other games it is possible that you will never reach maximum reward on any run (e.g. in some atari games, there is no maximum reward). If you were consistently hitting maximum reward on several successive runs, then that could be a good sign it is time to stop training, but again, how many times is enough? It's very subjective. – dilaudid Jul 02 '20 at 15:55