Stable Baselines is a Python library providing implementations of various reinforcement learning algorithms; it started as a community-maintained fork of OpenAI Baselines. Please mention the exact version of Stable Baselines that is being used in the body of the question.
Questions tagged [stable-baselines]
277 questions
1
vote
1 answer
gym 0.21 + stable_baselines3 TypeError: tuple indices must be integers or slices, not str
I'm trying to train a stable_baselines3 model on my custom gym environment. The training crashes with a TypeError on the first step.
Using cuda device
Traceback (most recent call last):
File "train_agent.py", line 12, in
…

gameveloster
- 901
- 1
- 6
- 18
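The truncated traceback above does not show where the tuple is being indexed with a string, so the exact cause cannot be pinned down here. A first step that often surfaces gym/SB3 API mismatches in a custom environment is Stable Baselines 3's built-in environment checker; a minimal sketch follows, where CartPole-v1 is only a stand-in for the custom environment from the question.

# Hedged sketch: run SB3's environment checker before training.
# "CartPole-v1" is only a stand-in; replace it with the custom environment instance.
import gym
from stable_baselines3.common.env_checker import check_env

env = gym.make("CartPole-v1")
check_env(env, warn=True)  # raises or warns if reset()/step() signatures or spaces are off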
1
vote
1 answer
How to correctly define this observation space for the custom Gym environment I am creating using gym.spaces.Box?
I am trying to implement the DDPG algorithm from a paper.
In the image below, g_k[n] and r_k[n] are KxM matrices of real values.
Theta[n] and v[n] are arrays of size M.
I want to write correct code to specify the state/observation space in my custom…

Sukhamjot Singh
- 35
- 6
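One possible way (not taken from the truncated question) to express two KxM real-valued matrices plus two length-M vectors is a gym.spaces.Dict of Box entries; the key names, bounds and the placeholder values of K and M below are assumptions. Stable Baselines 3 can train on Dict observations with its MultiInputPolicy.

# Hedged sketch of a composite observation space; dimensions and bounds are placeholders.
import numpy as np
from gym import spaces

K, M = 4, 8  # placeholder values; use the K and M from the paper

observation_space = spaces.Dict({
    "g": spaces.Box(low=-np.inf, high=np.inf, shape=(K, M), dtype=np.float32),    # g_k[n]
    "r": spaces.Box(low=-np.inf, high=np.inf, shape=(K, M), dtype=np.float32),    # r_k[n]
    "theta": spaces.Box(low=-np.inf, high=np.inf, shape=(M,), dtype=np.float32),  # Theta[n]
    "v": spaces.Box(low=-np.inf, high=np.inf, shape=(M,), dtype=np.float32),      # v[n]
})

An equivalent alternative is to flatten everything into a single Box of shape (2*K*M + 2*M,), which keeps the default MlpPolicy usable.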
1
vote
0 answers
stable_baselines3 best observation space for custom environment
I'm a newbie in RL and I'm learning stable_baselines3. I've created a simple 2D game where we want to catch as many falling apples as possible. If we don't catch an apple, it disappears and we lose a point; otherwise we gain 1 point. We can move only left…

Rozrewolwerowany rewolwer
- 172
- 1
- 10
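Since the question is truncated, the sketch below only illustrates one common choice for such a game: a flat, normalized Box holding the paddle position plus an (x, y) slot per apple. The cap on simultaneous apples and the normalization to [0, 1] are assumptions, not details from the question.

# Hedged sketch: a fixed-size, normalized Box observation for a catch-the-apples game.
import numpy as np
from gym import spaces

MAX_APPLES = 5  # assumed cap on apples visible at once; unused slots stay at 0

# 1 value for the player's x position + (x, y) per apple slot, all scaled to [0, 1]
observation_space = spaces.Box(low=0.0, high=1.0,
                               shape=(1 + 2 * MAX_APPLES,), dtype=np.float32)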
1
vote
0 answers
Should I cast and normalize my discrete observation space for stable baselines 3 custom environments?
I read Antonin Raffin's SB3 RL Tips and Tricks and I am wondering whether I should use a Box observation space and normalize it, or a discrete observation space.
I have a toy problem where my observations are a sequence of 10 scores that all have a lower bound of 0…

Olli
- 906
- 10
- 25
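Both options from the question can be written down directly; the sketch below assumes an upper bound of 100 per score, which is not stated in the excerpt. The RL Tips and Tricks guide generally recommends normalizing Box observations, which is what the helper at the end does.

# Hedged sketch: the two observation-space options side by side.
import numpy as np
from gym import spaces

N_SCORES = 10
SCORE_MAX = 100  # assumed upper bound; the excerpt only states the lower bound of 0

# Option 1: keep the scores discrete, one categorical dimension per score.
obs_space_discrete = spaces.MultiDiscrete([SCORE_MAX + 1] * N_SCORES)

# Option 2: cast to float and normalize to [0, 1] in a Box, as the guide suggests.
obs_space_box = spaces.Box(low=0.0, high=1.0, shape=(N_SCORES,), dtype=np.float32)

def normalize(scores):
    # Scale raw integer scores into the [0, 1] Box space.
    return np.asarray(scores, dtype=np.float32) / SCORE_MAX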
1
vote
1 answer
Training PPO from stable_baselines3 on a grid world that randomizes
I'm new to RL and I was hoping to get some advice from you all:
I created a custom environment that is a 10x10 grid world where the agent and its target destination (as well as some obstacles, namely fires) can be randomly placed. The state of the env…

Pete
- 11
- 3
1
vote
1 answer
gym RL with MultiDiscrete ActionSpace AttributeError: 'MultiDiscrete' object has no attribute 'spaces'
I'm trying to build a reinforcement learning algorithm that can play the Mastermind game. I'm using a MultiDiscrete action and observation space. The action space takes 4 slots with 6 colors each, and the observation space is 2x4. I created an…

AR_Jini
- 11
- 3
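The AttributeError usually means some code is treating the MultiDiscrete space like a Tuple or Dict space, which do have a .spaces attribute. For MultiDiscrete the per-dimension sizes live in .nvec instead; a minimal sketch follows (flattening the 2x4 observation into 8 entries is an assumption).

# Hedged sketch of the Mastermind spaces; only illustrates the MultiDiscrete API.
from gym import spaces

action_space = spaces.MultiDiscrete([6] * 4)       # 4 slots, 6 colors each
observation_space = spaces.MultiDiscrete([6] * 8)  # 2x4 board flattened to 8 entries

# MultiDiscrete has no .spaces attribute; use .nvec for the per-dimension sizes.
print(action_space.nvec)  # [6 6 6 6]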
1
vote
1 answer
How to access training metrics with a custom logger in Stable Baselines 3?
With Stable Baselines 3 it is possible to access metrics and info of the environment by using self.training_env.get_attr("your_attribute_name"); however, how does one access the training metrics that are generated by the model?
By setting verbose=1…

rich
- 520
- 6
- 21
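One way to get at values during training is a custom callback: anything passed to self.logger.record ends up in the same outputs (stdout, TensorBoard, CSV, ...) as the metrics the model writes itself. The attribute name below is the placeholder from the question.

# Hedged sketch: a callback that logs an environment attribute next to SB3's own metrics.
from stable_baselines3.common.callbacks import BaseCallback

class MetricLoggingCallback(BaseCallback):
    def _on_step(self) -> bool:
        # get_attr returns one value per vectorized sub-environment; take the first.
        value = self.training_env.get_attr("your_attribute_name")[0]
        self.logger.record("custom/your_attribute_name", value)
        return True

The model's own training metrics (train/loss, train/entropy_loss, ...) are written through the same logger, so they appear alongside the custom values once tensorboard_log is set on the model.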
1
vote
1 answer
How to register custom environment with OpenAI's gym package to use make_vec_env() in SB3 (for multiprocessing)?
Goal: In Stable Baselines 3, I want to be able to run multiple workers on my environment in parallel (multiprocessing) to train my model.
Method: As shown in this Google Colab, I believe I just need to run the below line of code:
vec_env =…

Vladimir Belik
- 280
- 1
- 12
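A minimal sketch of the registration plus make_vec_env wiring; the env id and the module path in entry_point are placeholders for the real package and class. SubprocVecEnv is what gives one process per worker; the default DummyVecEnv steps the copies sequentially in a single process.

# Hedged sketch: register a custom env, then build a multiprocess vectorized env.
import gym
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv

gym.envs.registration.register(
    id="MyCustomEnv-v0",                        # placeholder id
    entry_point="my_package.envs:MyCustomEnv",  # placeholder "module:Class" path
)

if __name__ == "__main__":  # guard required when subprocesses are spawned (Windows/macOS)
    vec_env = make_vec_env("MyCustomEnv-v0", n_envs=4, vec_env_cls=SubprocVecEnv)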
1
vote
0 answers
How to obtain gradients of Stable Baselines 3 models while training?
I currently use Stable Baselines 3 to train a reinforcement learning SAC agent on a custom environment. I would like to plot the mean of all calculated gradients of the model while it is training. There are some unexpected drops in the performance…

Nair von Muehlenen
- 11
- 2
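The excerpt does not show how the gradients should be consumed, so the sketch below only illustrates one possibility: a callback that reads the .grad tensors of the policy parameters. Those tensors hold whatever the most recent backward pass produced, so the logged values lag the actual updates slightly.

# Hedged sketch: log the mean gradient norm of the policy's parameters.
from stable_baselines3.common.callbacks import BaseCallback

class GradientNormCallback(BaseCallback):
    def _on_step(self) -> bool:
        norms = [p.grad.norm().item()
                 for p in self.model.policy.parameters()
                 if p.grad is not None]
        if norms:
            self.logger.record("custom/mean_grad_norm", sum(norms) / len(norms))
        return True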
1
vote
0 answers
The meaning of batch_size for on-policy reinforcement learning in stable-baselines3?
I use stable-baselines3 to train a PPO model, and I am unsure about the parameter "batch_size": does on-policy learning even have a notion of batch size? My teacher taught me that PPO updates itself when an episode finishes, so what's the…

Horivici
- 21
- 2
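For the on-policy algorithms in stable-baselines3, the update is not tied to episode boundaries: PPO collects n_steps transitions per environment, then splits that rollout buffer into minibatches of batch_size samples and passes over it n_epochs times. A hedged illustration with default-like values:

# Hedged sketch: how n_steps, batch_size and n_epochs interact in SB3's PPO.
from stable_baselines3 import PPO

# One environment here, so each rollout holds n_steps * 1 = 2048 transitions.
# Those 2048 samples are shuffled into minibatches of 64 and reused for 10 epochs,
# regardless of where episodes begin or end inside the rollout.
model = PPO("MlpPolicy", "CartPole-v1",
            n_steps=2048, batch_size=64, n_epochs=10, verbose=1)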
1
vote
1 answer
How can I change n_steps while using stable baselines3 (PPO implementation)?
I am implementing PPO from stable baselines3 for my custom environment. Right now n_steps = 2048, so the model update happens after 2048 time-steps. How can I change this? I want my model to update after n_steps = 1000.

D.g
- 147
- 4
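n_steps is simply a constructor argument of PPO. A hedged sketch; the batch_size of 100 is chosen here only so that it divides n_steps * n_envs = 1000, which avoids the truncated-minibatch warning.

# Hedged sketch: update every 1000 steps instead of the default 2048.
from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", n_steps=1000, batch_size=100, verbose=1)
model.learn(total_timesteps=100_000)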
1
vote
1 answer
More metrics in Tensorboard
I created a custom environment for an example of a trading bot (RL).
During training I wanted to check the results using TensorBoard, but what I see are only a few metrics, in particular only:
-----------------------------------------
| time/ …

GTF
- 125
- 10
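The train/* metrics (losses, entropy, explained variance, ...) only appear after the first policy update, and TensorBoard output has to be enabled via tensorboard_log. Extra values can be pushed through the logger from a callback; the metric names below are placeholders.

# Hedged sketch: enable TensorBoard and log an extra scalar from a callback.
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import BaseCallback

class ExtraMetricsCallback(BaseCallback):
    def _on_step(self) -> bool:
        # self.locals["rewards"] holds the rewards of the current vectorized step;
        # an env-specific value could instead be read via self.training_env.get_attr(...).
        self.logger.record("custom/step_reward", float(self.locals["rewards"].mean()))
        return True

model = PPO("MlpPolicy", "CartPole-v1", verbose=1, tensorboard_log="./tb_logs/")
model.learn(total_timesteps=50_000, callback=ExtraMetricsCallback())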
1
vote
0 answers
A bug encountered with "Maskable PPO" training on a custom Env setup
I encountered an error while using the SB3-contrib MaskablePPO action-masking algorithm.
File ~\anaconda3\lib\site-packages\sb3_contrib\common\maskable\distributions.py:231, in MaskableMultiCategoricalDistribution.apply_masking(self, masks)
228 masks =…

Godfrey Leung
- 11
- 1
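The truncated traceback is not enough to diagnose the exact bug, but apply_masking errors frequently come down to a mask whose shape does not match the action space: for a MultiDiscrete action space the mask must be a flat boolean vector of length action_space.nvec.sum(). A minimal wiring sketch with placeholders; MyCustomEnv stands in for the asker's environment.

# Hedged sketch: MaskablePPO with an ActionMasker wrapper; names are placeholders.
import numpy as np
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

def mask_fn(env) -> np.ndarray:
    # One boolean per possible sub-action value; everything is allowed here.
    # A real env would set invalid choices to False.
    return np.ones(int(env.action_space.nvec.sum()), dtype=bool)

env = ActionMasker(MyCustomEnv(), mask_fn)  # MyCustomEnv is a placeholder
model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)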
1
vote
1 answer
Is there any "better way" to train a quadruped to walk using reinforcement learning
I have been chasing the problem of using RL to train a quadruped to walk, but have had no noteworthy success. Following are the details of the gym env I am using.
Sim: pybullet
env.action_space = box(shape=(12,), upper = 1, lower =…

Shivam Chavan
- 11
- 2
1
vote
0 answers
Stable Baselines 3 - BaseFeaturesExtractor weights are not updated
I am working with Stable-Baselines 3 and want to create a custom feature extractor. Unfortunately, the weights of the layers (here: "self.cnn" and "self.linear") are not updated by backpropagation.
Here is my CustomCNN(BaseFeaturesExtractor)…

Giwdul
- 13
- 1
- 4
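The excerpt does not include the full class, so the cause cannot be confirmed here; one common reason for frozen extractor weights is that the extractor is instantiated manually instead of being handed to the policy via policy_kwargs. Below is a hedged sketch that follows the custom-feature-extractor pattern from the SB3 documentation; the environment id and layer sizes are placeholders.

# Hedged sketch of a custom feature extractor registered through policy_kwargs.
import torch as th
import torch.nn as nn
from gym import spaces
from stable_baselines3 import PPO
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class CustomCNN(BaseFeaturesExtractor):
    def __init__(self, observation_space: spaces.Box, features_dim: int = 128):
        super().__init__(observation_space, features_dim)
        n_channels = observation_space.shape[0]
        self.cnn = nn.Sequential(
            nn.Conv2d(n_channels, 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Flatten(),
        )
        with th.no_grad():
            sample = th.as_tensor(observation_space.sample()[None]).float()
            n_flatten = self.cnn(sample).shape[1]
        self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())

    def forward(self, observations: th.Tensor) -> th.Tensor:
        return self.linear(self.cnn(observations))

# Registering the extractor through policy_kwargs lets the policy build it and
# include its parameters in the optimizer; an extractor constructed manually
# outside the policy will not be trained.
policy_kwargs = dict(
    features_extractor_class=CustomCNN,
    features_extractor_kwargs=dict(features_dim=128),
)
model = PPO("CnnPolicy", "BreakoutNoFrameskip-v4", policy_kwargs=policy_kwargs, verbose=1)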