Formulation of a reward structure

Question

I am new to reinforcement learning and experimenting with training of RL agents.

I have a question about reward formulation. From a given state, if the agent takes a good action I give it a positive reward, and if the action is bad I give it a negative reward. If I make the positive rewards very large, say 100 times the magnitude of the negative rewards, will that help the agent during training?

Intuitively, I feel it will help the agent's training, but are there any drawbacks to such a skewed reward structure?

score 1 · Answer 1 · answered Nov 27 '19 at 13:25

Well, generally (this is a personal opinion based on my experience) I think rewards should be scaled relative to the impact they have on the agent. If the problem is sparse rewards, you can have a look at this Arxiv Insights YouTube video to see how that can be solved.

I can give one example of how this might cause trouble: if the positive rewards are much larger in magnitude than the negative ones, the agent will probably not care much about risking states with negative rewards on its way to the big positive reward. So you might end up with an agent that takes a lot of risks.
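To make the point concrete, here is a minimal sketch of a one-step decision, not a full RL setup. All names and numbers are hypothetical: a "safe" action always yields 0, while a "risky" action yields the good reward with a small probability and the bad reward otherwise. An agent that maximizes expected reward flips from safe to risky once the positive reward is scaled up 100 times.

```python
def expected_return(p_success, good_reward, bad_reward):
    """Expected one-step reward of a gamble that succeeds with probability p_success."""
    return p_success * good_reward + (1.0 - p_success) * bad_reward


def greedy_choice(p_success, good_reward, bad_reward, safe_reward=0.0):
    """Return 'risky' if the gamble has the higher expected reward, else 'safe'."""
    risky = expected_return(p_success, good_reward, bad_reward)
    return "risky" if risky > safe_reward else "safe"


# Hypothetical numbers: the risky action succeeds only 5% of the time.
p = 0.05

# Balanced rewards (+1 / -1): expected value of the gamble is negative.
balanced = greedy_choice(p, good_reward=1.0, bad_reward=-1.0)

# Skewed rewards (+100 / -1): the rare payoff dominates the frequent penalty.
skewed = greedy_choice(p, good_reward=100.0, bad_reward=-1.0)

print(balanced, skewed)
```

With these assumed numbers the gamble fails 95% of the time, yet under the skewed rewards it is still the reward-maximizing choice. Whether that is a drawback depends on how costly the bad outcomes really are in your environment.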
