I'm using the TF-Agents library for reinforcement learning, and I would like to account for the fact that some actions are invalid in a given state.
How can this be implemented?
Should I define an "observation_and_action_constraint_splitter" function when creating the DqnAgent?
If so, do you know of any tutorial on this?
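
To make the question concrete, here is a minimal sketch of what such a splitter could look like. It assumes the environment returns a dict observation; the key names "observation" and "valid_actions", and both function names, are hypothetical. The second function is only a plain-Python illustration of how a mask restricts a greedy choice, not the library's own implementation.

```python
def split_observation_and_mask(observation):
    """Split a full observation into (network_input, action_mask).

    Assumes a dict observation with the hypothetical keys
    "observation" and "valid_actions".
    """
    return observation["observation"], observation["valid_actions"]


def masked_greedy_action(q_values, action_mask):
    """Pick the highest-valued action among those the mask allows."""
    valid = [i for i, allowed in enumerate(action_mask) if allowed]
    if not valid:
        raise ValueError("action mask allows no actions")
    return max(valid, key=lambda i: q_values[i])


# Example: three actions, the second one is invalid in this state.
full_observation = {
    "observation": [0.1, 0.2],
    "valid_actions": [1, 0, 1],
}
network_input, mask = split_observation_and_mask(full_observation)

# Action 1 has the largest Q-value but is masked out, so action 2 is chosen.
best_action = masked_greedy_action([1.0, 5.0, 2.0], mask)
```

If this is the right approach, the first function would presumably be passed as observation_and_action_constraint_splitter=split_observation_and_mask when constructing the DqnAgent, with the Q-network built for the network-input part of the observation only.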