
In Q-learning we have reward feedback. Does that mean the agent needs to know the reward function in advance?

Max

1 Answer


The agent does not need any prior knowledge of the reward function, but it should receive a reward for every step it takes. Note that the rewards can all be zero until the episode ends. The term reward feedback simply means that the environment returns some scalar value for each transition.
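
To make this concrete, here is a minimal sketch of tabular Q-learning. The `env` object, its `n_actions` attribute, and the tuple returned by `step()` are hypothetical stand-ins for any episodic environment; the point is that the agent only consumes the scalar reward returned by `step()` after each transition, and never inspects the reward function itself.

```python
# A minimal sketch of tabular Q-learning. The environment interface
# (env.reset(), env.step(), env.n_actions) is a hypothetical stand-in.
import random
from collections import defaultdict

def q_learning(env, n_episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    # Q-values default to 0 for unseen (state, action) pairs.
    Q = defaultdict(float)
    for _ in range(n_episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                action = random.randrange(env.n_actions)
            else:
                action = max(range(env.n_actions),
                             key=lambda a: Q[(state, a)])
            # The environment supplies the reward; the agent has no
            # prior knowledge of the reward function. A zero reward on
            # every step until the end of the episode is fine.
            next_state, reward, done = env.step(action)
            # Q-learning update using the observed scalar reward.
            best_next = max(Q[(next_state, a)]
                            for a in range(env.n_actions))
            Q[(state, action)] += alpha * (
                reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```

Even with sparse rewards (zeros until the terminal step), the update still works: the final reward propagates backwards through the Q-values over repeated episodes via the `gamma * best_next` bootstrap term.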

nsidn98