Questions tagged [stable-baselines]

Stable Baselines is a Python library of reinforcement learning algorithm implementations, based on OpenAI Baselines. Please mention the exact version of Stable Baselines being used in the body of the question.

277 questions
0
votes
1 answer

StableBaselines3 / PPO / model rolls out but does not learn?

When a model learns there are a rollout phase and a learning phase. My models are rolling out, but they never show a learning phase. This is apparent both in the text output in a Jupyter notebook in VS Code and in TensorBoard. I built a very…
user3533030
  • 359
  • 3
  • 17
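A likely explanation, sketched below: SB3's PPO only enters its optimization ("train") phase after a full rollout buffer of n_steps * n_envs transitions has been collected, so if total_timesteps never exceeds that, only rollout logs appear. The environment and values here are illustrative, not from the question.

```python
# Minimal sketch, assuming the cause is a rollout buffer that never fills:
# PPO logs "train/" metrics only after n_steps * n_envs transitions.
import gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")

# The default n_steps is 2048; a smaller buffer makes the learning
# phase (and its log output) appear much sooner.
model = PPO("MlpPolicy", env, n_steps=256, verbose=1)
model.learn(total_timesteps=10_000)  # comfortably more than n_steps
```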
0
votes
1 answer

What is missing in Gym registration?

I get ValueError: xxx not found in gym registry, you maybe meant when trying to register a custom environment in Stable Baselines3. I tried the following command: python train.py --algo tqc --env donkey-mountain-track-v0 --eval-freq -1…
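For reference, a hedged sketch of the registration step that must run before the environment id can be resolved; the module path and class name below are placeholders, not taken from the question.

```python
# Sketch: register a custom env id before gym.make() (or the zoo's
# train.py) can look it up. "my_envs.mountain:DonkeyMountainEnv" is a
# hypothetical module:Class path.
import gym
from gym.envs.registration import register

register(
    id="donkey-mountain-track-v0",
    entry_point="my_envs.mountain:DonkeyMountainEnv",
)

env = gym.make("donkey-mountain-track-v0")  # now resolvable
```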
0
votes
1 answer

How can I improve this Reinforcement Learning scenario in Stable Baselines3?

In this scenario, I present a Box observation with numbers 0, 1, or 2 and shape (1, 10). The odds are 2% each for 0 and 2, and 96% for 1. I want the model to learn to pick the index of any 2 that appears; if there is no 2, it should just choose 0. Below…
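A minimal sketch of the environment as described, where the class name, reward values, and one-step episodes are assumptions:

```python
import numpy as np
import gym
from gym import spaces

class PickTwoEnv(gym.Env):
    """One-step episodes: observe a (1, 10) array of 0/1/2, pick an index."""

    def __init__(self):
        self.observation_space = spaces.Box(low=0, high=2, shape=(1, 10), dtype=np.float32)
        self.action_space = spaces.Discrete(10)

    def reset(self):
        # 2% zeros, 96% ones, 2% twos, as in the question
        self.obs = np.random.choice(
            [0, 1, 2], size=(1, 10), p=[0.02, 0.96, 0.02]
        ).astype(np.float32)
        return self.obs

    def step(self, action):
        twos = np.flatnonzero(self.obs[0] == 2)
        correct = (action in twos) if twos.size else (action == 0)
        return self.obs, float(correct), True, {}  # one decision per episode
```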
0
votes
1 answer

stable-baselines3, gym, train while also step/predict

With stable-baselines3, given an agent, we can call "action = agent.predict(obs)". And then with Gym, this would be "new_obs, reward, done, info = env.step(action)" (more or less; maybe I missed an input or an output). We also have…
Oren
  • 976
  • 9
  • 23
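One hedged way to combine the two, sketched below: alternate short learn() calls with manual predict/step rollouts on a separate environment; reset_num_timesteps=False keeps the timestep counter (and logging) continuous across calls.

```python
import gym
from stable_baselines3 import PPO

train_env = gym.make("CartPole-v1")
eval_env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", train_env)

for _ in range(10):
    # train for one PPO rollout's worth of steps
    model.learn(total_timesteps=2_048, reset_num_timesteps=False)
    # then drive a separate env manually with predict/step
    obs, done = eval_env.reset(), False
    while not done:
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, done, info = eval_env.step(action)
```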
0
votes
2 answers

Accessing training metrics in stable-baselines3

Is it possible to access the A2C total loss and whether the environment truncated or terminated within a custom callback? I'd like to access truncated and terminated in _on_step. That would allow me to terminate training when the environment…
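A sketch of one approach, under the assumption that the algorithm's local variables suffice: SB3 callbacks expose them through self.locals, where on-policy algorithms such as A2C publish dones and infos each step (per-update losses are recorded on the logger after each optimization phase instead).

```python
from stable_baselines3.common.callbacks import BaseCallback

class StopOnTerminationCallback(BaseCallback):
    """Stop training when an episode ends by termination, not truncation."""

    def _on_step(self) -> bool:
        dones = self.locals.get("dones")
        infos = self.locals.get("infos", [])
        # with gym's TimeLimit wrapper, truncation is flagged in info
        truncated = any(i.get("TimeLimit.truncated", False) for i in infos)
        if dones is not None and dones.any() and not truncated:
            return False  # returning False ends model.learn()
        return True
```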
0
votes
1 answer

Stable-Baselines make_vec_env() not calling the wrapper kwargs as expected

I have a custom gym environment where I am implementing action masking. The code below works well: from toyEnv_mask import PortfolioEnv # file with my custom gym environment from sb3_contrib import MaskablePPO from sb3_contrib.common.wrappers import…
Generalenthu
  • 107
  • 1
  • 7
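For comparison, a hedged sketch of how the wrapper arguments are meant to be threaded through make_vec_env; the mask function is hypothetical, and note that wrapper_kwargs go to the wrapper class while env_kwargs go to the environment constructor, a common mix-up.

```python
from stable_baselines3.common.env_util import make_vec_env
from sb3_contrib.common.wrappers import ActionMasker
from toyEnv_mask import PortfolioEnv  # the question's custom env module

def mask_fn(env):
    return env.valid_action_mask()  # hypothetical method on PortfolioEnv

vec_env = make_vec_env(
    PortfolioEnv,
    n_envs=4,
    wrapper_class=ActionMasker,
    wrapper_kwargs={"action_mask_fn": mask_fn},  # forwarded to ActionMasker
)
```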
0
votes
0 answers

Problem registering custom gym: gym.error.UnregisteredEnv

I created a custom Gym environment called BazEnv for use with stable_baselines3, but I cannot seem to register it properly. I get the error gym.error.UnregisteredEnv: No registered env with id: foo_bar_envs/BazEnv-v0 when I try to create the custom gym…
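A hedged sketch of the usual fix: the register() call has to execute before gym.make(), typically by living in the package's __init__.py and importing the package for its side effect. The module layout below is an assumption.

```python
# foo_bar_envs/__init__.py  (assumed layout)
from gym.envs.registration import register

register(
    id="foo_bar_envs/BazEnv-v0",
    entry_point="foo_bar_envs.baz:BazEnv",  # hypothetical module:Class
)

# training script
import gym
import foo_bar_envs  # noqa: F401 -- imported for the register() side effect

env = gym.make("foo_bar_envs/BazEnv-v0")
```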
0
votes
0 answers

How to rescale the output of a PPO model to the range of the action space?

Stable Baselines3's PPO implementation with the MLP policy uses two hidden layers of 64 nodes each. When setting up my gym environment, I set my action space to the range [-50, 50]. However, there seem to be no such bounds on the model output in the PPO…
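A common workaround, sketched under the assumption that a Gaussian policy's raw output is unbounded: declare the action space as [-1, 1] (which SB3 clips actions to before calling step) and rescale inside the environment.

```python
import numpy as np
from gym import spaces

# declare the space PPO sees as [-1, 1] ...
action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)

def rescale(action, low=-50.0, high=50.0):
    """Map an action from [-1, 1] to [low, high] inside env.step()."""
    return low + (np.asarray(action) + 1.0) * 0.5 * (high - low)
```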
0
votes
1 answer

Use MiniGrid environment with stable-baselines3

I'm using the MiniGrid library to work with different 2D navigation problems as experiments for my reinforcement learning problem. I'm also using the stable-baselines3 library to train PPO models, but unfortunately, while training PPO using…
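A hedged sketch of the usual cause: MiniGrid returns dict observations that MlpPolicy cannot consume directly, so the env is typically wrapped first (FlatObsWrapper here; ImgObsWrapper plus CnnPolicy is the other common route).

```python
import gym
from gym_minigrid.wrappers import FlatObsWrapper
from stable_baselines3 import PPO

# flatten MiniGrid's dict observation into a single Box
env = FlatObsWrapper(gym.make("MiniGrid-Empty-8x8-v0"))

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```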
0
votes
1 answer

Optuna HyperbandPruner not pruning?

My study is set up to use the Hyperband pruner with 60 trials, a 10M max resource, and a reduction factor of 2. def optimize_agent(trial): # ... model = PPO("MlpPolicy", env, **params) model.learn(total_timesteps=2000000) study =…
gameveloster
  • 901
  • 1
  • 6
  • 18
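One likely explanation, sketched below: a pruner only acts on intermediate values, and a single uninterrupted model.learn() call never reports any. A callback that calls trial.report() and raises TrialPruned gives Hyperband something to prune; the reward extraction here is schematic.

```python
import optuna
from stable_baselines3.common.callbacks import BaseCallback

class OptunaPruningCallback(BaseCallback):
    def __init__(self, trial, report_every=10_000):
        super().__init__()
        self.trial = trial
        self.report_every = report_every
        self._next_report = report_every

    def _on_step(self) -> bool:
        if self.num_timesteps >= self._next_report:
            self._next_report += self.report_every
            # mean episode reward from SB3's rolling episode-info buffer
            rewards = [ep["r"] for ep in self.model.ep_info_buffer]
            if rewards:
                self.trial.report(sum(rewards) / len(rewards), self.num_timesteps)
                if self.trial.should_prune():
                    raise optuna.TrialPruned()
        return True
```

Inside optimize_agent, this would be passed as model.learn(total_timesteps=2000000, callback=OptunaPruningCallback(trial)).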
0
votes
1 answer

Can't set the frame rate when recording a video with VecVideoRecorder

I have a working RL model and setup that produce a video for me; however, because the model is reasonably good, the videos are very short (it reaches a destination, therefore better = shorter). Is there a way to drop the frame rate of the video output? I…
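A hedged sketch of one knob to try: VecVideoRecorder takes the playback rate from the env's metadata, so lowering it there slows the output video. The exact key (render_fps vs video.frames_per_second) depends on the gym version in use, so this is an assumption to verify.

```python
import gym
from stable_baselines3.common.vec_env import DummyVecEnv, VecVideoRecorder

venv = DummyVecEnv([lambda: gym.make("CartPole-v1")])  # stand-in env
venv.metadata["render_fps"] = 10  # older gym: "video.frames_per_second"

venv = VecVideoRecorder(
    venv,
    "videos/",
    record_video_trigger=lambda step: step == 0,
    video_length=500,
)
```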
0
votes
1 answer

Is it possible to use a stable-baselines model as the baseline for another model?

I recently trained a stable-baselines PPO model for a couple of days and it is performing well on test environments. Essentially, I am trying to iterate on this model. I was wondering if it is possible to use this model as a new baseline for future…
David3186
  • 73
  • 4
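A hedged sketch of the most direct form of iteration: load the saved model and continue training it, rather than treating it as a frozen baseline. The paths and the new environment below are placeholders.

```python
import gym
from stable_baselines3 import PPO

new_env = gym.make("CartPole-v1")  # placeholder: must match the trained spaces

model = PPO.load("ppo_baseline.zip")  # hypothetical save path
model.set_env(new_env)
model.learn(total_timesteps=1_000_000, reset_num_timesteps=False)
model.save("ppo_iterated")
```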
0
votes
1 answer

Pre-Train a Model using imitation learning with Stable-baselines3

I have been trying to figure out a way to pre-train a model using Stable-Baselines3. In the original documentation for Stable Baselines (the version that runs on TensorFlow 1.x), this seems to be an easy task: from stable_baselines import PPO2 …
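SB3 dropped the old pretrain() API; a hedged sketch using the companion imitation library's behavioural cloning, where env and demos (an imitation Transitions object of expert demonstrations) are placeholders:

```python
import numpy as np
from imitation.algorithms import bc

bc_trainer = bc.BC(
    observation_space=env.observation_space,  # env: placeholder environment
    action_space=env.action_space,
    demonstrations=demos,                     # placeholder expert transitions
    rng=np.random.default_rng(0),
)
bc_trainer.train(n_epochs=10)
# bc_trainer.policy can then be used to warm-start an SB3 model
```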
0
votes
0 answers

Hey, I am trying to build a custom environment and train an agent on it

At this line: model = PPO("MlpPolicy", env, verbose=1, tensorboard_log=log_path) it raises an error: AttributeError Traceback (most recent call last) Cell In[200], line 2 1 log_path = '/Users/mafaz2/Desktop/open cv…
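With the traceback truncated, a hedged first step is SB3's environment checker, which usually surfaces a bad space definition or missing attribute before PPO is even constructed:

```python
from stable_baselines3.common.env_checker import check_env

check_env(env, warn=True)  # env: the custom environment instance
```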
0
votes
0 answers

Teaching a Reinforcement Learning agent on many custom environments that perform steps asynchronously

Based on an old Adobe Flash game, I've created a custom OpenAI Gym environment that interacts with the game by reading its process memory and performing actions with mouse clicks (the game window needs to be visible). I'm able to train an agent on one environment,…
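A hedged sketch of the closest built-in option: SubprocVecEnv steps each environment in its own process, so several game instances can run in parallel. The factory body below is a placeholder for the question's Flash-game wrapper.

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import SubprocVecEnv

def make_env(rank):
    def _init():
        # placeholder: construct one game instance (window) per subprocess
        return gym.make("CartPole-v1")
    return _init

if __name__ == "__main__":  # required for subprocess spawning on Windows/macOS
    venv = SubprocVecEnv([make_env(i) for i in range(4)])
    model = PPO("MlpPolicy", venv, verbose=1)
    model.learn(total_timesteps=100_000)
```

Note that the vectorized envs are still stepped in lock-step batches; fully asynchronous stepping is not something SB3's vec envs provide.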