tf_agents changing underlying suite_gym reward function

Question

I'm trying to modify the MountainCarContinuous-v0 environment loaded through suite_gym because training is getting stuck in a local minimum. The default reward function penalizes large actions, which are exactly what the optimal solution requires. So I would like to try other reward functions to see if I can get it to train properly.

In base gym, I can use a wrapper as follows:

import gymnasium as gym

env = gym.make("MountainCarContinuous-v0")
wrapped_env = gym.wrappers.TransformReward(env, lambda r: 0 if r <= 0 else 1)
state, info = wrapped_env.reset()
action = wrapped_env.action_space.sample()  # any valid action
state, reward, terminated, truncated, info = wrapped_env.step(action)
# reward is now always 0 or 1, depending on whether the car reached the goal.

How can you do the same thing inside tf_agents?

from tf_agents.environments import suite_gym

env = suite_gym.load("MountainCarContinuous-v0")
env = None  # how do I modify the underlying environment?

I have tried modifying env._env, but I get the error: reset() got an unexpected keyword argument 'seed'.

Another approach seems to be the gym_env_wrappers argument, but I don't know how to pass the gym wrapper here, especially because TransformReward requires the reward function as an argument to its constructor:

    test = suite_gym.load("MountainCarContinuous-v0", gym_env_wrappers=[gym.wrappers.TransformReward])

What is the proper way to get ahold of the underlying gym env and modify it?
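
One idea for the constructor-argument problem, as a sketch only: as far as I can tell, gym_env_wrappers applies each entry as wrapper(env) with a single argument, so functools.partial (or a lambda) could bind the reward function ahead of time. The FakeEnv, TransformReward, and load names below are hypothetical stand-ins so the snippet runs without gym or tf_agents installed; they only mimic the shapes of the real classes and are not the libraries' actual code.

```python
import functools


class FakeEnv:
    """Hypothetical stand-in for a gym environment (old 4-tuple step API)."""

    def reset(self):
        return 0.0

    def step(self, action):
        # Mimic MountainCarContinuous: a small penalty for large actions.
        reward = -0.1 * action ** 2
        return 0.0, reward, False, {}


class TransformReward:
    """Stand-in with the same (env, f) constructor shape as the gym wrapper."""

    def __init__(self, env, f):
        self.env = env
        self.f = f

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, self.f(reward), done, info


def load(env, gym_env_wrappers=()):
    """Hypothetical loader: calls each wrapper with the env as its only argument."""
    for wrapper in gym_env_wrappers:
        env = wrapper(env)
    return env


def sparse_reward(r):
    # 1 only when the original reward is positive (goal reached), else 0.
    return 0 if r <= 0 else 1


# Bind the reward function up front so the wrapper takes only `env`.
sparse_wrapper = functools.partial(TransformReward, f=sparse_reward)
env = load(FakeEnv(), gym_env_wrappers=[sparse_wrapper])
env.reset()
_, reward, _, _ = env.step(1.0)
print(reward)
```

With the real libraries this would presumably look like suite_gym.load("MountainCarContinuous-v0", gym_env_wrappers=[lambda e: gym.wrappers.TransformReward(e, sparse_reward)]). That is unverified, and it assumes the installed gym and tf_agents versions are compatible with each other.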
