
Suppose the environment is a game with 5 doors, and at each step you must choose exactly 2 of them. How can that be represented as an action_space?

self.action_space = spaces.Box(np.array([0, 0, 0, 0, 0]), np.array([1, 1, 1, 1, 1]))

With this definition, an action can select no doors [0 0 0 0 0], all doors [1 1 1 1 1], or anything in between. I want to force every action to select exactly 2 doors.

Examples of correct actions:

[1 1 0 0 0]
[1 0 1 0 0]
etc.
Kevin

1 Answer


Probably the simplest solution is to enumerate all the possible actions, i.e., all the allowed combinations of two doors, and assign a number to each one. The environment then "decodes" each number into the corresponding combination of two doors.

This way, the agent simply chooses from a discrete action space (spaces.Discrete(n) in OpenAI Gym), where n is the number of allowed combinations (10 for 2 doors out of 5).
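
As a rough illustration of the idea above, here is a minimal sketch of the enumerate-and-decode step. The names N_DOORS, N_CHOSEN, ACTIONS and decode_action are hypothetical helpers introduced only for this example, and the environment wiring shown in the comments assumes Gym's spaces module is already imported, as in the question.

```python
from itertools import combinations

N_DOORS = 5    # doors in the game
N_CHOSEN = 2   # doors that must be chosen at each step

# Every allowed pair of doors, in a fixed order.
# The index of a pair in this list is its discrete action id.
ACTIONS = list(combinations(range(N_DOORS), N_CHOSEN))


def decode_action(action_id):
    """Map a discrete action id to a 0/1 vector with exactly N_CHOSEN ones."""
    chosen = ACTIONS[action_id]
    return [1 if door in chosen else 0 for door in range(N_DOORS)]


# Assumed wiring inside the environment (not part of the original answer):
#   in __init__:   self.action_space = spaces.Discrete(len(ACTIONS))
#   in step():     doors = decode_action(action)

print(len(ACTIONS))      # 10
print(decode_action(0))  # [1, 1, 0, 0, 0]
print(decode_action(1))  # [1, 0, 1, 0, 0]
```

Because the lookup table is built once from combinations, changing N_DOORS or N_CHOSEN changes the size of the discrete space without touching the decoding logic.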

Pablo EM