
I have a custom gym environment where I am implementing action masking. The code below works well

from toyEnv_mask import PortfolioEnv # file with my custom gym environment
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3.common.env_util import make_vec_env
from sb3_contrib.common.maskable.utils import get_action_masks



env = PortfolioEnv()
env = ActionMasker(env, action_mask_fn=PortfolioEnv.action_masks)
env = DummyVecEnv([lambda: env])

model = MaskablePPO("MlpPolicy", env, gamma=0.4, verbose=1, tensorboard_log="./MPPO_tensorboard/")
...

I want to parallelize this, and I am trying to use the make_vec_env() helper. It sets up fine and starts running, but it no longer respects the action masks. This is the line I am using to replace the three lines above where I initialize the env:

env = make_vec_env(PortfolioEnv, wrapper_class=ActionMasker, wrapper_kwargs={'action_mask_fn': PortfolioEnv.action_masks}, n_envs=2)

Any suggestions / help would be greatly appreciated.

Generalenthu
1 Answer


I was making it unnecessarily complicated.

A simple env = make_vec_env(PortfolioEnv, n_envs=16) works, because MaskablePPO looks for an action_masks() method on the environment itself; the ActionMasker wrapper is only needed when the env does not already expose that method.

And the three lines that were working previously could have been just:

env = PortfolioEnv()
env = DummyVecEnv([lambda: env])
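To illustrate why no wrapper is needed, here is a minimal pure-Python sketch of the mechanism. ToyMaskedEnv and this stripped-down is_masking_supported are hypothetical stand-ins for illustration, not the real sb3_contrib code: the point is only that the library checks for an action_masks() method on the env, which a custom env can define directly.

```python
# Toy stand-in for a maskable env: all that matters to MaskablePPO is
# that the environment exposes an action_masks() method.
class ToyMaskedEnv:
    n_actions = 4

    def __init__(self):
        # Placeholder state: pretend only actions at or above the
        # current position are legal.
        self.position = 0

    def action_masks(self):
        # True = action is allowed this step.
        return [i >= self.position for i in range(self.n_actions)]


# Simplified sketch of the capability check: no ActionMasker wrapper is
# needed when the env defines action_masks() itself.
def is_masking_supported(env) -> bool:
    return callable(getattr(env, "action_masks", None))


env = ToyMaskedEnv()
env.position = 2
print(is_masking_supported(env))  # True: masks picked up without a wrapper
print(env.action_masks())         # [False, False, True, True]
```

In this sketch the mask lives in the env class, so any vectorization helper that constructs fresh PortfolioEnv instances still produces maskable envs, with no per-env wrapping required.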