Stable Baselines is a Python library providing implementations of various reinforcement learning algorithms, based on OpenAI Baselines. Please mention the exact version of Stable Baselines being used in the body of the question.
Questions tagged [stable-baselines]
277 questions
0
votes
1 answer
StableBaselines3 / PPO / model rolls out but does not learn?
When a model learns there is:
A rollout phase
A learning phase
My models are rolling out but they never show a learning phase. This is apparent both in the text output in a Jupyter notebook in VS Code and in TensorBoard.
I built a very…

user3533030
- 359
- 3
- 17
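One thing worth checking for the question above (a sketch, not a diagnosis of the asker's actual code): PPO only runs its train phase after collecting a full buffer of n_steps * n_envs transitions, and SB3 dumps its log table after collecting and before training, so the train/ section never appears at all if learn() finishes within the first iteration. A minimal illustration:

    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("CartPole-v1")
    # With n_steps=2048 and a single env, train/ metrics only show up in the
    # logs after the second rollout; a budget below 2048 steps never trains.
    model = PPO("MlpPolicy", env, n_steps=2048, verbose=1)
    model.learn(total_timesteps=10_000)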
0
votes
1 answer
What is missing in Gym registration?
I get
ValueError: xxx not found in gym registry, you maybe meant
when trying to register a custom environment in stable baselines 3. I tried the following command:
python train.py --algo tqc --env donkey-mountain-track-v0 --eval-freq -1…

Mohamed Esmael
- 1
- 1
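For context, the shape of a working registration (module and class names here are hypothetical): the id passed to --env must match a register() call that actually runs in the training process, typically via an import.

    import gym
    from gym.envs.registration import register

    # Hypothetical package/class; the id must match the --env flag exactly.
    register(
        id="donkey-mountain-track-v0",
        entry_point="my_envs.tracks:MountainTrackEnv",
    )

    env = gym.make("donkey-mountain-track-v0")  # now resolves in the registry

The RL Zoo's train.py also exposes a --gym-packages flag for importing modules that perform such registrations.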
0
votes
1 answer
How can I improve this Reinforcement Learning scenario in Stable Baselines3?
In this scenario, I present a box observation with numbers 0, 1 or 2 and shape (1, 10).
The odds for 0 and 2 are 2% each, and 96% for 1.
I want the model to learn to pick the index of any 2 that comes; if there is no 2, it should just choose index 0.
Below…

Edhowler
- 715
- 8
- 17
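A minimal sketch of how that scenario can be encoded as a one-step-per-episode environment (class name and reward values are illustrative, using the Gymnasium API):

    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces

    class PickTwoEnv(gym.Env):
        """Observation: shape (1, 10) of {0, 1, 2}; action: pick an index."""

        def __init__(self):
            super().__init__()
            self.observation_space = spaces.Box(low=0, high=2, shape=(1, 10), dtype=np.int64)
            self.action_space = spaces.Discrete(10)

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            # 2% odds each for 0 and 2, 96% for 1, as in the question.
            self.obs = self.np_random.choice(
                [0, 1, 2], size=(1, 10), p=[0.02, 0.96, 0.02]
            ).astype(np.int64)
            return self.obs, {}

        def step(self, action):
            twos = np.flatnonzero(self.obs[0] == 2)
            # Reward picking the index of a 2, or index 0 when no 2 is present.
            correct = (action in twos) if twos.size else (action == 0)
            return self.obs, (1.0 if correct else -1.0), True, False, {}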
0
votes
1 answer
stable-baselines3, gym, train while also step/predict
With stable-baselines3, given an agent, we can call "action = agent.predict(obs)". And then with Gym, this would be "new_obs, reward, done, info = env.step(action)" (more or less; maybe I've missed an input or an output).
We also have…

Oren
- 976
- 9
- 23
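The loop the question gestures at, written out against the current APIs (Gymnasium's five-value step and SB3's two-value predict). Note that learn() is where gradient updates happen; predict()/step() is inference only:

    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("CartPole-v1")
    model = PPO("MlpPolicy", env).learn(total_timesteps=2_048)

    obs, info = env.reset()
    for _ in range(200):
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            obs, info = env.reset()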
0
votes
2 answers
Accessing training metrics in stable-baselines3
Is it possible to access the A2C total loss and whether the environment truncated or terminated within a custom callback?
I'd like to access truncated and terminated in _on_step. That would allow me to terminate training when the environment…

Alpine Chamois
- 43
- 4
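A sketch of the callback side (hedged: the exact keys in self.locals depend on the algorithm, but on-policy algorithms like A2C expose dones and infos during rollout collection):

    from stable_baselines3.common.callbacks import BaseCallback

    class StopOnTermination(BaseCallback):
        def _on_step(self) -> bool:
            for done, info in zip(self.locals["dones"], self.locals["infos"]):
                truncated = info.get("TimeLimit.truncated", False)
                if done and not truncated:  # terminated, not time-limit truncated
                    return False  # returning False stops training
            return True

The loss is only computed later, in train(), so it isn't in self.locals at step time; the most recently logged values (e.g. train/policy_loss) can be read from self.model.logger.name_to_value after an update.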
0
votes
1 answer
Stable-Baselines make_vec_env() not calling the wrapper kwargs as expected
I have a custom gym environment where I am implementing action masking. The code below works well
from toyEnv_mask import PortfolioEnv # file with my custom gym environment
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import…

Generalenthu
- 107
- 1
- 7
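For reference, the shape of a working call (the env id and mask function below are placeholders): wrapper_kwargs is forwarded to wrapper_class as keyword arguments, so the keys must match the wrapper's __init__ parameters.

    from sb3_contrib.common.wrappers import ActionMasker
    from stable_baselines3.common.env_util import make_vec_env

    def mask_fn(env):
        return env.valid_action_mask()  # hypothetical method on the custom env

    vec_env = make_vec_env(
        "PortfolioEnv-v0",                        # hypothetical registered id
        n_envs=4,
        wrapper_class=ActionMasker,
        wrapper_kwargs={"action_mask_fn": mask_fn},
    )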
0
votes
0 answers
Problem registering custom gym: gym.error.UnregisteredEnv
I created a custom gym environment called BazEnv for use with stable_baselines3, but I cannot seem to register it properly.
I get the error
gym.error.UnregisteredEnv: No registered env with id: foo_bar_envs/BazEnv-v0
when I try to create the custom gym…

gameveloster
- 901
- 1
- 6
- 18
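A sketch of the namespaced registration that id implies (the package layout is hypothetical). The register() call has to execute before gym.make(), usually from the package's __init__.py, and the package must be imported first:

    # foo_bar_envs/__init__.py  (hypothetical layout)
    from gym.envs.registration import register

    register(
        id="foo_bar_envs/BazEnv-v0",
        entry_point="foo_bar_envs.envs:BazEnv",
    )

    # training script
    import gym
    import foo_bar_envs  # noqa: F401  -- triggers the register() call above

    env = gym.make("foo_bar_envs/BazEnv-v0")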
0
votes
0 answers
How to rescale the output of a PPO model to the range of the action space?
Stable Baselines3's PPO implementation with the MLP policy uses two hidden layers of 64 nodes each. When setting up my gym environment, I set the action space to the range [-50, 50]. However, there seem to be no such bounds on the model output in the PPO…

Manav Mishra
- 103
- 1
- 6
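There are indeed no hard bounds on the raw Gaussian policy output; SB3 clips the sampled action to the declared Box bounds before stepping the env. The pattern the SB3 docs recommend is to declare a normalized [-1, 1] action space and rescale inside the environment, along these lines:

    import numpy as np

    def rescale_action(action, low=-50.0, high=50.0):
        """Map a policy action in [-1, 1] to the env's true range [low, high]."""
        action = np.clip(action, -1.0, 1.0)
        return low + 0.5 * (action + 1.0) * (high - low)

Inside step(), apply rescale_action(action) before using the value, while the declared action_space stays Box(-1, 1, ...).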
0
votes
1 answer
Use MiniGrid environment with stable-baseline3
I'm using the MiniGrid library to work with different 2D navigation problems as experiments for my reinforcement learning problem. I'm also using the stable-baselines3 library to train PPO models, but unfortunately, while training PPO using…

user19826638
- 31
- 1
- 4
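The usual sticking point here (hedged, since the excerpt is cut off) is that MiniGrid returns Dict observations that SB3's default policies reject; flattening them with one of MiniGrid's own wrappers is one fix:

    import gymnasium as gym
    from minigrid.wrappers import FlatObsWrapper
    from stable_baselines3 import PPO

    # FlatObsWrapper encodes the observation as one flat vector,
    # which MlpPolicy accepts directly.
    env = FlatObsWrapper(gym.make("MiniGrid-Empty-5x5-v0"))
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=10_000)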
0
votes
1 answer
Optuna HyperbandPruner not pruning?
My study is set up to use the Hyperband pruner with 60 trials, a 10M max resource, and a reduction factor of 2.
def optimize_agent(trial):
    # ...
    model = PPO("MlpPolicy", env, **params)
    model.learn(total_timesteps=2000000)
study =…

gameveloster
- 901
- 1
- 6
- 18
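A likely cause, sketched below: a pruner only acts on trial.report() calls, and a single model.learn() reports nothing, so Hyperband never gets a chance to prune. Reporting an intermediate score from a callback fixes that (names and the report interval are illustrative):

    import optuna
    from stable_baselines3 import PPO
    from stable_baselines3.common.callbacks import BaseCallback

    class TrialReportCallback(BaseCallback):
        """Report rolling mean episode reward to Optuna; stop when pruned."""

        def __init__(self, trial, report_every=10_000):
            super().__init__()
            self.trial = trial
            self.report_every = report_every
            self.pruned = False

        def _on_step(self) -> bool:
            if self.n_calls % self.report_every == 0 and self.model.ep_info_buffer:
                rewards = [ep["r"] for ep in self.model.ep_info_buffer]
                self.trial.report(sum(rewards) / len(rewards), step=self.num_timesteps)
                if self.trial.should_prune():
                    self.pruned = True
                    return False  # stop training early
            return True

    def optimize_agent(trial):
        model = PPO("MlpPolicy", "CartPole-v1")
        callback = TrialReportCallback(trial)
        model.learn(total_timesteps=2_000_000, callback=callback)
        if callback.pruned:
            raise optuna.TrialPruned()
        rewards = [ep["r"] for ep in model.ep_info_buffer]
        return sum(rewards) / len(rewards)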
0
votes
1 answer
Can't set the frame rate when recording a video with VecVideoRecorder
I have a working RL model and setup that produces a video for me; however, because the model is reasonably good, the videos are very short (it reaches its destination, so better = shorter).
Is there a way to drop the frame rate of the video output? I…

TimNewman
- 13
- 4
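One lever worth trying (hedged; behaviour varies across gym/SB3 versions): VecVideoRecorder takes its frame rate from the env's metadata, so lowering render_fps before wrapping slows the playback down:

    import gymnasium as gym
    from stable_baselines3.common.vec_env import DummyVecEnv, VecVideoRecorder

    def make_env():
        env = gym.make("CartPole-v1", render_mode="rgb_array")
        env.metadata["render_fps"] = 10  # CartPole's default is 50
        return env

    venv = VecVideoRecorder(
        DummyVecEnv([make_env]),
        video_folder="videos/",
        record_video_trigger=lambda step: step == 0,
        video_length=500,
    )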
0
votes
1 answer
Is it possible to use a stable-baselines model as the baseline for another model?
I recently trained a stable-baselines PPO model for a couple of days and it is performing well on test environments. Essentially, I am trying to iterate on this model. I was wondering if it is possible to use this model as a new baseline for future…

David3186
- 73
- 4
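Yes: a saved SB3 model can be reloaded and trained further, optionally on a fresh environment with the same observation/action spaces (the checkpoint name below is hypothetical):

    import gymnasium as gym
    from stable_baselines3 import PPO

    model = PPO.load("ppo_baseline")         # hypothetical checkpoint path
    model.set_env(gym.make("CartPole-v1"))   # spaces must match the original
    model.learn(total_timesteps=1_000_000, reset_num_timesteps=False)
    model.save("ppo_baseline_v2")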
0
votes
1 answer
Pre-Train a Model using imitation learning with Stable-baselines3
I have been trying to figure out a way to pre-train a model using Stable-Baselines3.
In the original documentation for Stable Baselines (the version that runs on TensorFlow 1.x), this seems to be an easy task:
from stable_baselines import PPO2
…

Lord-Goku
- 1
- 4
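The SB2-era pretrain() API did not survive into SB3; the maintained route is behavioural cloning via the companion imitation library, roughly as below (env and the expert transitions are assumed to be prepared already):

    import numpy as np
    from imitation.algorithms import bc

    # `env` and `transitions` (expert data as imitation's Transitions type)
    # are assumed to exist; both are placeholders here.
    bc_trainer = bc.BC(
        observation_space=env.observation_space,
        action_space=env.action_space,
        demonstrations=transitions,
        rng=np.random.default_rng(0),
    )
    bc_trainer.train(n_epochs=10)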
0
votes
0 answers
Hey, I am trying to build a custom environment and train an agent on it
at this line:
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log=log_path)
It raises an error:
AttributeError Traceback (most recent call last)
Cell In[200], line 2
1 log_path = '/Users/mafaz2/Desktop/open cv…
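The traceback is cut off above, so the actual attribute error isn't visible; a common first diagnostic for a custom environment is SB3's env checker, which raises descriptive errors for API violations (the env class below is a placeholder):

    from stable_baselines3.common.env_checker import check_env

    env = MyCustomEnv()  # hypothetical custom environment class
    check_env(env)       # validates spaces, reset/step signatures, dtypes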
0
votes
0 answers
Teaching a Reinforcement Learning Agent on many custom Environments that perform steps asynchronously
Based on an old Adobe Flash game, I've created a custom OpenAI Gym environment that interacts with the game by reading its process memory and performing actions with mouse clicks (the game window needs to be visible). I'm able to train an agent on one environment,…

Filip Młynarski
- 3,534
- 1
- 10
- 22
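For training one agent across several live game instances, the standard SB3 tool is SubprocVecEnv, which steps each env in its own process (the env class below is a placeholder for the memory-reading Gym described above):

    from stable_baselines3 import PPO
    from stable_baselines3.common.vec_env import SubprocVecEnv

    def make_env(rank):
        def _init():
            return FlashGameEnv(window_index=rank)  # hypothetical custom env
        return _init

    if __name__ == "__main__":
        venv = SubprocVecEnv([make_env(i) for i in range(4)])
        model = PPO("MlpPolicy", venv, verbose=1)
        model.learn(total_timesteps=100_000)

Note that SubprocVecEnv still synchronizes at each step boundary (it waits for all envs before returning), so fully asynchronous stepping is not something SB3's built-in vectorized envs provide.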