
I have created a Gym environment and am able to train an agent in it with PPO from Stable Baselines3. However, I am not getting the desired results: the agent seems to get stuck in a local optimum rather than finding the global one. Even reaching this stage required a lot of compute, and I am not sure my hyperparameter settings are sensible. I would like to perform hyperparameter tuning for this custom env, and I am following this example - https://rl-baselines3-zoo.readthedocs.io/en/master/guide/tuning.html#hyperparameters-search-space

I am having trouble following the document on two points.

  1. It says that Optuna is used for optimization, but nowhere is it mentioned that Optuna must be installed first. Do we need to install it separately?

  2. I ran my code using the same command structure as the documentation's example:

    python train.py --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median
    

For my environment, I ran:

python mobile_robot_scripts/train_youbot_camera.py --algo ppo --env youbotCamGymEnv -n 10000 --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median
   

Now, there are a couple of things that I think are wrong. Unlike the default MountainCar-v0 example, I have not registered my env with Gym, because nowhere is it mentioned that registering is required or has any advantage beyond the convenience of gym.make(). I also do not have Optuna installed. The code still runs, but I don't think it does any hyperparameter tuning; it just continues with normal training.

I would like to know how I should proceed to actually perform hyperparameter tuning. As you can see, I am pretty much using the defaults. Here is my code for reference:

# Add the parent dir to sys.path so the package can be found. Only needed for a source-code build; a pip install doesn't need it.
import os, inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)

from youbotCamGymEnv import youbotCamGymEnv
from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env

def main():  

  env = youbotCamGymEnv(renders=False, isDiscrete=False)
  # check_env inspects the custom environment and prints additional warnings if needed
  check_env(env)

  model = PPO("CnnPolicy", env, verbose=1)
  model.learn(total_timesteps=50000)
  print("############ Training completed ################")
  model.save(os.path.join(currentdir, "youbot_camera_trajectory"))
  
  # del model

  # env = youbotCamGymEnv(renders=True, isDiscrete=False)
  model = PPO.load(os.path.join(currentdir, "youbot_camera_trajectory"))
  # obs = env.reset()

  # for i in range(1000):
  #     action, _states = model.predict(obs, deterministic=True)
  #     obs, reward, done, info = env.step(action)
  #     # print("reward is ", reward)
  #     env.render(mode='human')
  #     if done:
  #       obs = env.reset()

  # env.close()

if __name__ == '__main__':
  main()

Manish
  • You could and should use Optuna for that, and yes, it must be installed. You may use the basic example: https://github.com/optuna/optuna-examples/blob/main/rl/sb3_simple.py - it is pretty self-explanatory. Replace line 120 there with the instantiation of your custom environment (see the sketch after these comments). – gehirndienst May 05 '23 at 14:21
  • I am still struggling to run the hyperparameter tuning example. I have replaced the line with my custom env in my repo (lines 38 and 125) - https://github.com/manish-nayak/pybullet-stable-baselines/blob/main/mobile_robot_scripts/youbot_optuna.py - but I am getting the following error: "Training and eval env are not of the same type: {self.training_env} != {self.eval_env}". Also, it doesn't even seem to start training, which does happen with the default parameter set as mentioned previously - https://github.com/manish-nayak/pybullet-stable-baselines/blob/main/mobile_robot_scripts/train_youbot_camera.py – Manish May 07 '23 at 12:39
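Following up on the first comment, here is a minimal sketch of that standalone Optuna route, assuming the youbotCamGymEnv constructor from the question; the sampled ranges, timestep budget, and trial count are illustrative only. Building the training and evaluation envs the same way also avoids the "Training and eval env are not of the same type" error mentioned above.

    import optuna
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    from youbotCamGymEnv import youbotCamGymEnv

    def objective(trial):
        # Sample a few PPO hyperparameters (illustrative ranges, not recommendations).
        params = {
            "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True),
            "gamma": trial.suggest_float("gamma", 0.95, 0.9999),
            "n_steps": trial.suggest_categorical("n_steps", [256, 512, 1024, 2048]),
        }

        # Build training and evaluation envs the same way so they are of the same type.
        train_env = youbotCamGymEnv(renders=False, isDiscrete=False)
        eval_env = youbotCamGymEnv(renders=False, isDiscrete=False)

        model = PPO("CnnPolicy", train_env, verbose=0, **params)
        model.learn(total_timesteps=10000)  # small budget per trial, just for the sketch

        mean_reward, _ = evaluate_policy(model, eval_env, n_eval_episodes=5)
        train_env.close()
        eval_env.close()
        return mean_reward

    if __name__ == "__main__":
        study = optuna.create_study(
            direction="maximize",
            sampler=optuna.samplers.TPESampler(),
            pruner=optuna.pruners.MedianPruner(),
        )
        study.optimize(objective, n_trials=20)
        print("Best params:", study.best_params)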

1 Answer


I think you used RL Zoo in the wrong way. You shouldn't run your own train.py (train_youbot_camera.py). Your script does not parse any command-line arguments, so none of --algo ppo --env youbotCamGymEnv -n 10000 --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median is actually passed into your program; it just runs its hard-coded training loop.

According to the documentation, you should only run python train.py when you are inside the RL Zoo folder; otherwise, run python -m rl_zoo3.train. Also, I don't think you need to write your own training script, because the goal of rl_zoo3 is to prepare everything for you.
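For example, something along these lines (a sketch, not a verified command: the env ID youbotCamGymEnv-v0 assumes you have registered your environment under that name; also note that, unlike the MountainCar command you quoted, your own command omits the -optimize flag that actually triggers tuning):

    python -m rl_zoo3.train --algo ppo --env youbotCamGymEnv-v0 -n 10000 -optimize --n-trials 100 --n-jobs 2 --sampler tpe --pruner median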

Back to your question: I think Optuna is installed automatically when you install rl_zoo3.
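If you want to be sure, installing both explicitly does no harm (package names as published on PyPI):

    pip install rl_zoo3 optuna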

You should register your environment; this is mentioned in the documentation. Check this: https://rl-baselines3-zoo.readthedocs.io/en/master/guide/custom_env.html
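A minimal registration sketch (the ID youbotCamGymEnv-v0, the entry-point path, and max_episode_steps are assumptions; adjust them to your package layout):

    import gym
    from gym.envs.registration import register

    # Hypothetical ID and entry point; point these at your actual module.
    register(
        id="youbotCamGymEnv-v0",
        entry_point="mobile_robot_scripts.youbotCamGymEnv:youbotCamGymEnv",
        max_episode_steps=500,  # assumption: set to your real episode length
        kwargs={"renders": False, "isDiscrete": False},
    )

    env = gym.make("youbotCamGymEnv-v0")

As far as I recall, the custom-env guide linked above then has you import the module that performs this registration via RL Zoo's --gym-packages argument and add an entry for the env to the PPO hyperparameters file, so the zoo knows which search space and defaults to use.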