
I am new to RL, and the most I've done so far is CartPole in OpenAI Gym. In CartPole, the API automatically provides the reward for each action taken. How am I supposed to decide the reward when all I have is pixel data and no "magic function" that tells me the reward for a given action?

Say I want to make a self-driving bot in GTA San Andreas. The only input I have access to is raw pixels. How am I supposed to figure out the reward for an action the bot takes?

Jules G.M.
ParmuTownley

1 Answer


You need to design a reward that acts as a proxy for the behavior you want, and that is not a trivial task.

If there are numbers in a fixed part of the screen representing a score, then you can use old-fashioned image processing techniques to read them and use the result as your reward function.
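As a rough illustration of the score-reading idea, here is a minimal sketch that template-matches digits at fixed positions and uses the change in score between two frames as the reward. Everything in it is an assumption for the example: the 3x5 `FONT` templates, the cell sizes, and the helper names `render_score`, `read_score` and `score_reward` are hypothetical, and the HUD crop is assumed to be already cut out of the screenshot and thresholded to `'0'`/`'1'` pixels. For a real game you would crop the digit templates from its own HUD.

```python
# Hypothetical 3x5 bitmap font standing in for the game's real HUD digits.
FONT = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}
DIGIT_W, DIGIT_H, GAP = 3, 5, 1  # assumed fixed layout of the score display


def render_score(score, n_digits=4):
    """Draw a fake, already-binarized HUD crop (used here only for testing)."""
    text = str(score).zfill(n_digits)
    return [("0" * GAP).join(FONT[ch][row] for ch in text)
            for row in range(DIGIT_H)]


def count_mismatches(template, cell):
    """Number of pixels that differ between a template and a digit cell."""
    return sum(a != b
               for t_row, c_row in zip(template, cell)
               for a, b in zip(t_row, c_row))


def read_score(hud, n_digits=4):
    """Read the score by matching each fixed-position cell against FONT."""
    digits = []
    for i in range(n_digits):
        x0 = i * (DIGIT_W + GAP)
        cell = [row[x0:x0 + DIGIT_W] for row in hud]
        # Pick the digit whose template differs in the fewest pixels.
        best = min(FONT, key=lambda ch: count_mismatches(FONT[ch], cell))
        digits.append(best)
    return int("".join(digits))


def score_reward(prev_hud, curr_hud):
    """Reward for one step: how much the on-screen score changed."""
    return read_score(curr_hud) - read_score(prev_hud)
```

Using the score difference rather than the raw score means the agent is rewarded for the step that earned the points, not for every later step as well.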

If there is a minimap in a fixed part of the screen with a fixed scale and orientation, then you could use the negative of the distance from your character to a target as the reward, so that getting closer increases the reward.
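The minimap idea could look roughly like the sketch below. It assumes the minimap has already been cropped from the frame into a grid of color values, and that the player and the target are each drawn in a unique, known color. The function names `find_marker` and `minimap_reward` and the color labels are hypothetical, chosen only for this example.

```python
import math


def find_marker(minimap, marker_color):
    """Return the centroid (x, y) of all pixels equal to marker_color, or None."""
    xs, ys = [], []
    for y, row in enumerate(minimap):
        for x, pixel in enumerate(row):
            if pixel == marker_color:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # marker not visible on the minimap
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def minimap_reward(minimap, player_color, target_color):
    """Negative Euclidean distance (in minimap pixels) from player to target."""
    player = find_marker(minimap, player_color)
    target = find_marker(minimap, target_color)
    if player is None or target is None:
        return None  # caller decides how to handle a missing marker
    return -math.hypot(player[0] - target[0], player[1] - target[1])


# Toy 5x5 minimap: "P" marks the player, "T" the target, "." the background.
toy_minimap = [
    list("P...."),
    list("....."),
    list("....."),
    list("....."),
    list("...T."),
]
reward = minimap_reward(toy_minimap, "P", "T")  # player (0, 0), target (3, 4)
```

Because the distance is measured in minimap pixels, this only works as a consistent signal when the scale and orientation of the minimap really are fixed.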

If there are no fixed elements in the UI that you can use as a proxy for the reward, then you are going to have a bad time, unless you can somehow access the game's internal variables (using the position coordinates of your player character, for example).

Jsevillamol