My custom Gym environment has a discrete action space with 4 actions, each represented as a binary array (the observation space is a Box):
[0, 0, 0, 0] action 1
[1, 0, 0, 0] action 2
[0, 1, 1, 0] action 3
[1, 0, 1, 0] action 4
The problem is that the model-based RL algorithm I am testing is designed for continuous action spaces.
What is the best way to convert this action space to a continuous one? How would the upper and lower bounds of the continuous space work in that case? Is it possible to one-hot encode the discrete actions so that they can be represented as continuous values?
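To make the one-hot idea concrete, here is a rough sketch of what I have in mind, not something I know to be the right approach. The algorithm would output a 4-dimensional continuous vector, one score per action, and the environment would pick the action with the highest score. The names `ACTIONS`, `LOW`, `HIGH`, and `continuous_to_discrete` are made up for illustration, and the bounds of [-1, 1] are an assumption.

```python
# The four discrete actions from the table above, as binary arrays.
ACTIONS = [
    [0, 0, 0, 0],  # action 1
    [1, 0, 0, 0],  # action 2
    [0, 1, 1, 0],  # action 3
    [1, 0, 1, 0],  # action 4
]

# Assumed bounds of the continuous action vector (hypothetical choice).
LOW, HIGH = -1.0, 1.0


def continuous_to_discrete(scores):
    """Map a 4-dimensional continuous vector to one of the discrete actions.

    Each component is treated as a score for one action (a relaxed one-hot
    encoding). The highest score wins; ties go to the lowest index.
    """
    if len(scores) != len(ACTIONS):
        raise ValueError("expected one score per discrete action")
    # Clip to the assumed bounds, as a bounded Box action space would.
    clipped = [min(max(s, LOW), HIGH) for s in scores]
    # Argmax over the clipped scores.
    index = max(range(len(clipped)), key=clipped.__getitem__)
    return ACTIONS[index]


if __name__ == "__main__":
    print(continuous_to_discrete([0.1, 0.9, -0.3, 0.2]))
```

In a Gym environment I imagine this mapping would live in the `action` method of a `gym.ActionWrapper`, with the wrapper's action space declared as a `Box` of shape `(4,)` using the same bounds.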