
I'm trying to add illegal-action masking to my DQN agent using masked_epsilon_greedy. Does anyone know how I can update the policy network to use observation["your_key_for_observation"] rather than observation, given that the observation space is a dictionary containing both the observations and the legal actions?

Lucas Hendren
Echo

1 Answer


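The answer is to add lambda inputs: inputs["your_key_for_observation"] to the network as its first layer, so that the rest of the network receives only the observation entry of the dictionary. Posting this in case someone else runs into the same issue in the future.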
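For anyone who wants to see the idea end to end, here is a minimal, framework-agnostic sketch in plain Python. It is not the actual agent code: `sequential`, `toy_q_values`, the `"legal_actions"` key, and this `masked_epsilon_greedy` function are all hypothetical stand-ins I made up for illustration. In a real agent the lambda would presumably go at the front of whatever sequential container your library provides.

```python
import random


def sequential(*layers):
    """Compose callables left to right, like a Sequential container."""
    def forward(inputs):
        for layer in layers:
            inputs = layer(inputs)
        return inputs
    return forward


def toy_q_values(features):
    # Hypothetical stand-in for the real Q-network: one value per action.
    return [2.0 * x for x in features]


# The lambda is the first "layer": it pulls the observation entry out of
# the dictionary so the following layers never see the legal-action mask.
q_network = sequential(
    lambda inputs: inputs["your_key_for_observation"],
    toy_q_values,
)


def masked_epsilon_greedy(q_values, legal_actions, epsilon, rng=random):
    """Pick an action among the legal ones only (mask: 1 = legal, 0 = illegal)."""
    legal = [a for a, ok in enumerate(legal_actions) if ok]
    if not legal:
        raise ValueError("no legal action available")
    if rng.random() < epsilon:
        # Explore: uniform choice over legal actions.
        return rng.choice(legal)
    # Exploit: best Q-value among legal actions.
    return max(legal, key=lambda a: q_values[a])


# Dictionary observation; the "legal_actions" key name is an assumption.
observation = {
    "your_key_for_observation": [0.1, 0.9, 0.4],
    "legal_actions": [1, 0, 1],
}

q_values = q_network(observation)
action = masked_epsilon_greedy(q_values, observation["legal_actions"], epsilon=0.0)
print(action)  # 2: action 1 has the highest Q-value but is masked out
```

The point is that the key selection lives inside the network, so the agent can keep passing the full dictionary around and the action-selection step can still read the mask from it.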

Echo