I have created a Gym environment and am able to train it via PPO from Stable Baselines3. However, I am not getting the desired results: the agent seems to be stuck in a local optimum rather than the global one. Even reaching this stage required a lot of compute, and I am not sure my parameter settings are right. I would like to perform hyperparameter tuning for this custom env, and I am following this example - https://rl-baselines3-zoo.readthedocs.io/en/master/guide/tuning.html#hyperparameters-search-space
I am unable to follow the document. It says that Optuna is used for the optimization, but nowhere does it mention that Optuna must be installed first. Do we need to install it separately?
I ran my code using the same structure:
python train.py --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median
My environment:
python mobile_robot_scripts/train_youbot_camera.py --algo ppo --env youbotCamGymEnv -n 10000 --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median
Now, there are a couple of things that I think are wrong. Unlike the default MountainCar-v0 example, I have not registered my env with gym, because nowhere is it mentioned that registering is required or has any advantage apart from the gym.make() convenience. I also do not have Optuna installed. The code still runs, but I don't think it does any hyperparameter tuning; it just continues with normal training.
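For reference, this is what I understand registering would look like, so that gym.make() (and the zoo's --env flag) can find the env. The id and the entry_point module path are from my project layout and may need adjusting:

```python
from gym.envs.registration import register

register(
    id="YoubotCamGymEnv-v0",                        # gym requires the Name-v<digit> pattern
    entry_point="youbotCamGymEnv:youbotCamGymEnv",  # "module:ClassName", resolved lazily
    max_episode_steps=1000,                         # illustrative episode cap
)

# afterwards: env = gym.make("YoubotCamGymEnv-v0")
```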
I would like to know how to proceed to actually perform hyperparameter tuning. As you can see, I am pretty much using the defaults. Here is my code for reference:
#add parent dir to find package. Only needed for source code build, pip install doesn't need it.
import os, inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)
from youbotCamGymEnv import youbotCamGymEnv
import datetime
from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env

def main():
    env = youbotCamGymEnv(renders=False, isDiscrete=False)
    # It will check your custom environment and output additional warnings if needed
    check_env(env)
    model = PPO("CnnPolicy", env, verbose=1)
    model.learn(total_timesteps=50000)
    print("############Training completed################")
    model.save(os.path.join(currentdir, "youbot_camera_trajectory"))
    # del model
    # env = youbotCamGymEnv(renders=True, isDiscrete=False)
    model = PPO.load(os.path.join(currentdir, "youbot_camera_trajectory"))
    # obs = env.reset()
    # for i in range(1000):
    #     action, _states = model.predict(obs, deterministic=True)
    #     obs, reward, done, info = env.step(action)
    #     # print("reward is ", reward)
    #     env.render(mode='human')
    #     if done:
    #         obs = env.reset()
    # env.close()
if __name__ == '__main__':
main()