I'm trying to add illegal-action masking to my DQN agent using masked_epsilon_greedy.
Does anyone know how I can update the policy network to use observation["your_key_for_observation"]
rather than observation, since the observation space is a dictionary containing both the observations and the legal actions?

Lucas Hendren

Echo
1 Answer
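The answer is to add lambda inputs: inputs["your_key_for_observation"]
to the network, so that the layers after it receive only the observation entry rather than the whole dictionary. Posting this in case someone encounters the issue in the future.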
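
For anyone who wants to see the idea in isolation, here is a minimal, library-free sketch in plain Python. It is not the actual agent code: `sequential`, `toy_q_values`, the `"legal_actions"` key, and this `masked_epsilon_greedy` are hypothetical stand-ins I made up for illustration, and your framework's real functions may have different signatures. It only shows a selector lambda placed in front of the network, plus a masked epsilon-greedy choice over the resulting Q-values.

```python
import random

OBS_KEY = "your_key_for_observation"  # placeholder key from the question
MASK_KEY = "legal_actions"            # hypothetical key name for the action mask


def sequential(*layers):
    """Chain callables so each one receives the previous one's output."""
    def forward(inputs):
        for layer in layers:
            inputs = layer(inputs)
        return inputs
    return forward


def toy_q_values(observation):
    """Stand-in for the real Q-network: returns one value per action."""
    return [sum(observation) * (i + 1) for i in range(3)]


# The selector lambda goes first, so later layers see only the observation entry.
q_network = sequential(lambda inputs: inputs[OBS_KEY], toy_q_values)


def masked_epsilon_greedy(q_values, legal_mask, epsilon, rng=random):
    """Hypothetical helper: epsilon-greedy restricted to legal actions."""
    legal = [a for a, ok in enumerate(legal_mask) if ok]
    if rng.random() < epsilon:
        return rng.choice(legal)                   # explore among legal actions
    return max(legal, key=lambda a: q_values[a])   # exploit: best legal action


observation = {OBS_KEY: [0.5, 1.5], MASK_KEY: [1, 1, 0]}
q_values = q_network(observation)
action = masked_epsilon_greedy(q_values, observation[MASK_KEY], epsilon=0.0)
```

Here the third action has the highest Q-value but is masked out, so the greedy choice falls back to the best legal action. In a real agent the mask would be read from the same observation dictionary and passed to whatever masking function your framework provides.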

Echo