Alternative to nested tuple/dict for stable baselines observation space

Question

I'm currently in the process of setting up a machine learning project using stable-baselines3 and gym. After setting up a basic skeleton for my environment, I ran the project and got the following error:

File "C:\Users\bucpa\anaconda3\envs\splendor_ai\lib\site-packages\stable_baselines3\common\preprocessing.py", line 214, in check_for_nested_spaces
    raise NotImplementedError(
NotImplementedError: Nested observation spaces are not supported (Tuple/Dict space inside Tuple/Dict space).

Looking at the indicated code, I can see that stable-baselines3 does not allow Tuple/Dict spaces to be nested inside each other. However, I'm a bit stuck on how to work around this. My observation data is based on the current state of a game and doesn't lend itself to being normalized into a flat list. Does anyone have a suggestion on how to solve this?

For reference, my observation data is currently structured as follows:

{
    AFFORDABLE: {int, [], int, int},
    AVAILABLE_RESERVE: {int, [], int, int},
    PLAYER_OWNED: {int, [], int, int},
    PLAYER_RESERVED: {int, [], int, int},
    PLAYER_GEMS: [],
    BANK_GEMS: []
}

I attempted to initialize it using the following code:

self.observation_space = gym.spaces.Dict(
    spaces={
        AFFORDABLE: gym.spaces.Tuple((gym.spaces.Discrete(3, 1), gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32), gym.spaces.Discrete(6, 1), gym.spaces.Discrete(5, 0))),
        AVAILABLE_RESERVE: gym.spaces.Tuple((gym.spaces.Discrete(3, 1), gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32), gym.spaces.Discrete(6, 1), gym.spaces.Discrete(5, 0))),
        PLAYER_OWNED: gym.spaces.Tuple((gym.spaces.Discrete(3, 1), gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32), gym.spaces.Discrete(6, 1), gym.spaces.Discrete(5, 0))),
        PLAYER_RESERVED: gym.spaces.Tuple((gym.spaces.Discrete(3, 1), gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32), gym.spaces.Discrete(6, 1), gym.spaces.Discrete(5, 0))),
        PLAYER_GEMS: gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32),
        BANK_GEMS: gym.spaces.Box(low=0, high=7, shape=(5,), dtype=np.int32)
    })
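
One possible way around the restriction, sketched here under assumptions rather than taken from stable-baselines3 itself, is to keep a Dict at the top level but lift every member of the inner tuples up into it under a compound key, so that nothing is nested any more. The helper name flatten_one_level, the "__" separator, and the string keys below are hypothetical choices made for illustration; the constants AFFORDABLE, PLAYER_GEMS and so on from the code above may hold different values.

```python
from typing import Any, Dict, Mapping


def flatten_one_level(nested: Mapping[str, Any], sep: str = "__") -> Dict[str, Any]:
    """Lift the members of any inner tuple/dict up to the top level.

    Hypothetical helper, not part of gym or stable-baselines3.
    An entry such as {"A": (x, y)} becomes {"A__0": x, "A__1": y};
    entries that are not tuples or mappings are copied unchanged.
    """
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        if isinstance(value, Mapping):
            # Inner dict: combine the outer and inner keys.
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        elif isinstance(value, tuple):
            # Inner tuple: use the position as the second half of the key.
            for index, sub_value in enumerate(value):
                flat[f"{key}{sep}{index}"] = sub_value
        else:
            flat[key] = value
    return flat


# Made-up observation that mirrors the structure in the question
# (string keys are an assumption standing in for the real constants).
example_obs = {
    "AFFORDABLE": (1, [0, 1, 2, 0, 0], 3, 0),
    "PLAYER_GEMS": [1, 0, 2, 0, 0],
    "BANK_GEMS": [4, 4, 4, 4, 4],
}
flat_obs = flatten_one_level(example_obs)
```

The same key scheme would have to be used for the space definition as well: each inner gym.spaces.Tuple exposes its members through its .spaces attribute, so those members could be passed through the helper and the result wrapped in a single gym.spaces.Dict. A flat Dict space like that is the kind stable-baselines3 is meant to handle with "MultiInputPolicy", though whether this layout suits the game is something only experimentation can tell.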
