I'm trying to modify the MountainCarContinuous-v0 environment loaded through tf_agents' suite_gym because training is getting stuck in a local minimum. The default reward function penalizes large actions, even though large actions are what an optimal solution needs. So I would like to try other reward functions to see if I can get it to train properly.
In base gym, I can use a wrapper as follows:
import gymnasium as gym
env = gym.make("MountainCarContinuous-v0")
wrapped_env = gym.wrappers.TransformReward(env, lambda r: 0 if r <= 0 else 1)
state, info = wrapped_env.reset()
action = 0.5  # example force; the action space is [-1.0, 1.0]
state, reward, terminated, truncated, info = wrapped_env.step([action])
# reward will now always be 0 or 1 depending on whether it reached the goal or not.
How can I do the same thing inside tf_agents?
from tf_agents.environments import suite_gym
env = suite_gym.load("MountainCarContinuous-v0")
env = None # how do I modify the underlying environment?
I have tried modifying env._env, but I get this error: reset() got an unexpected keyword argument 'seed'
Another approach seems to be the following, but I don't know how to pass the gym wrapper here, because TransformReward needs a function as a second argument to its constructor and I only get to pass the class itself.
test = suite_gym.load("MountainCarContinuous-v0", gym_env_wrappers=[gym.wrappers.TransformReward])
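To make the problem concrete, here is a minimal sketch of what I think is needed, assuming each entry in gym_env_wrappers is simply called with the environment as its only argument (I have not confirmed this against the tf_agents source). StandInEnv and StandInTransformReward are hypothetical stand-ins I wrote so the snippet runs without gym or tf_agents installed; only the idea of binding the reward function ahead of time is the point.

```python
def binarize_reward(r):
    """Map any positive reward to 1.0 and everything else to 0.0."""
    return 1.0 if r > 0 else 0.0


class StandInEnv:
    """Hypothetical stand-in for the gym env: penalizes action magnitude."""

    def step(self, action):
        # Returns (observation, reward, done, info) with a negative reward.
        return 0.0, -0.1 * action ** 2, False, {}


class StandInTransformReward:
    """Hypothetical stand-in with an (env, f) constructor, like a reward wrapper."""

    def __init__(self, env, f):
        self.env = env
        self.f = f

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, self.f(reward), done, info


def make_reward_wrapper(env):
    """One-argument callable that supplies the reward function itself."""
    return StandInTransformReward(env, binarize_reward)


# Assumption: the loader applies each wrapper as wrapper(env), in order.
gym_env_wrappers = [make_reward_wrapper]
env = StandInEnv()
for wrapper in gym_env_wrappers:
    env = wrapper(env)

_, reward, _, _ = env.step(1.0)
```

If that assumption holds, the real call would presumably pass something like `lambda env: gym.wrappers.TransformReward(env, binarize_reward)` in the gym_env_wrappers list, but I don't know whether that is the intended usage.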
What is the proper way to get hold of the underlying gym env and modify it?