I would like to run the following code, but with a custom environment instead of CartPole:

import ray
import ray.rllib.agents.dqn.apex as apex
from ray.tune.logger import pretty_print


def train_cartpole() -> None:
    ray.init()
    config = apex.APEX_DEFAULT_CONFIG.copy()
    config["num_gpus"] = 0
    config["num_workers"] = 3
    trainer = apex.ApexTrainer(config=config, env="CartPole-v0")

    for _ in range(1000):
        # Perform one iteration of training the policy with Apex-DQN
        result = trainer.train()
        print(pretty_print(result))


train_cartpole()

My environment is defined as a gym.Env class, and I want to create it using gym.make and then apply a custom wrapper and gym's FlattenObservation() to it. I have found ways of providing the environment as a class or as a string, but that does not work for me because I do not know how to apply the wrappers afterwards.
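For illustration, the wrapping I have in mind looks roughly like this (MyCustomEnv-v0 and MyWrapper are placeholder names for my environment and my wrapper):

import gym
from gym.wrappers import FlattenObservation

env = gym.make("MyCustomEnv-v0")  # placeholder id for my registered env
env = MyWrapper(env)              # placeholder for my custom wrapper
env = FlattenObservation(env)     # flatten the observation space into a 1-D Box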

blindeyes
  • Have you seen this question? https://stackoverflow.com/questions/58551029/rllib-use-custom-registered-environments/60792871#60792871 – hubbs5 Aug 17 '22 at 21:55

1 Answer

You can register your environment with Ray and then use it as if it were a built-in gym environment. The env_creator function below is where you apply your wrappers:

import ray
from ray.tune.registry import register_env
from gym.wrappers import FlattenObservation

from ExampleEnv import ExampleEnv  # your custom gym.Env class

def env_creator(env_config):
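    # RLlib calls this function with the dict supplied as config["env_config"]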
    # wrap and return an instance of your custom class
    return FlattenObservation(ExampleEnv())

# Choose a name and register your custom environment
register_env("ExampleEnv-v0", env_creator)

# example config using your custom env
config = {
    "env": "ExamleEnv-v0",
    # Change the following line to `“framework”: “tf”` to use tensorflow
    "framework": "torch",
    "model": {
        "fcnet_hiddens": [32],
        "fcnet_activation": "linear",
    },
}

Next, just use Ray as you normally would, for example:

ray.shutdown()
ray.init(
    num_cpus=3,
    include_dashboard=False,
    ignore_reinit_error=True,
    log_to_driver=False,
)
# execute training
analysis = ray.tune.run(
    "PPO",
    config=config,
    stop={"episode_reward_mean": 100},
    checkpoint_at_end=True,
)
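
If you want to keep the Apex-DQN setup from the question rather than using Tune, the registered name can be passed to the trainer just like a gym id. A minimal sketch, assuming the register_env call above has already run:

import ray
import ray.rllib.agents.dqn.apex as apex
from ray.tune.logger import pretty_print

ray.init()
config = apex.APEX_DEFAULT_CONFIG.copy()
config["num_gpus"] = 0
config["num_workers"] = 3
# the name registered via register_env works wherever an env id is expected
trainer = apex.ApexTrainer(config=config, env="ExampleEnv-v0")

for _ in range(1000):
    # Perform one iteration of training the policy with Apex-DQN
    result = trainer.train()
    print(pretty_print(result))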
Aaron B.