Questions tagged [stable-baselines]

Stable Baselines is a Python library of improved implementations of reinforcement learning algorithms, based on OpenAI Baselines; its successor, Stable Baselines3 (SB3), is implemented in PyTorch. Please mention the exact version of Stable Baselines being used in the body of the question.

277 questions
3
votes
0 answers

Using SubprocVecEnv in SB3 results in "cannot pickle 'weakref' object" error, even though the same vectorized env works for DummyVecEnv. Why?

For multiprocessing I want to use the suggested vec_env_cls, SubprocVecEnv. Using it as in the env line of the code below, or as vec_env_cls=SubprocVecEnv in make_vec_env, both result in an error about an unpicklable object. def…
Stifterson
  • 31
  • 1
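A common fix worth checking here: SubprocVecEnv spawns worker processes, so the script needs an if __name__ == "__main__": guard and the env factory must be picklable (unpicklable objects such as open handles or weakrefs stored on the env cause exactly this error). A minimal sketch of the intended usage, with CartPole standing in for the asker's custom env:

    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.vec_env import SubprocVecEnv

    if __name__ == "__main__":  # guard required when worker processes are spawned
        # each worker builds its own env, so only the env id / factory must be picklable
        vec_env = make_vec_env("CartPole-v1", n_envs=4, vec_env_cls=SubprocVecEnv)
        model = PPO("MlpPolicy", vec_env, verbose=1)
        model.learn(total_timesteps=10_000)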
3
votes
1 answer

Can't install stable-baselines3[extra]

I am having trouble installing stable-baselines3[extra] and am not sure if I missed a dependency. Machine: Mac M1, Python: 3.10.9, pip3: 23.0. !pip3 install 'stable-baselines3[extra]' I used the above command to…
Codovert
  • 43
  • 1
  • 5
3
votes
0 answers

Stable-Baselines3 and PettingZoo

I am trying to understand how to train agents in a PettingZoo environment using the single-agent algorithm PPO implemented in Stable-Baselines3. I'm following this tutorial, where the agents act in a cooperative environment and they are all trained…
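For reference, the route usually taken in that kind of tutorial is to flatten the multi-agent env into an SB3-compatible vector env with SuperSuit, so that every agent shares one policy. A rough sketch under that assumption, using the PettingZoo pistonball env and SuperSuit's preprocessing wrappers (none of these names come from the question itself):

    import supersuit as ss
    from stable_baselines3 import PPO
    from pettingzoo.butterfly import pistonball_v6

    # cooperative parallel env; every agent shares one PPO policy
    env = pistonball_v6.parallel_env()
    env = ss.color_reduction_v0(env, mode="B")            # grayscale image obs
    env = ss.resize_v1(env, x_size=84, y_size=84)
    env = ss.frame_stack_v1(env, 3)
    env = ss.pettingzoo_env_to_vec_env_v1(env)             # one vec-env slot per agent
    env = ss.concat_vec_envs_v1(env, 4, base_class="stable_baselines3")
    model = PPO("CnnPolicy", env, verbose=1)
    model.learn(total_timesteps=100_000)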
3
votes
2 answers

Dict Observation Space for Stable Baselines3 Not Working

I've created a minimal reproducible example below; it can be run in a new Google Colab notebook for ease. Once the first install finishes, just Runtime > Restart and Run All for it to take effect. I've made a simple roulette game environment below…
wildcat89
  • 1,159
  • 16
  • 47
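As context for readers of this question: Dict observation spaces do work in SB3, but they require the "MultiInputPolicy" rather than "MlpPolicy". A minimal sketch with a hypothetical two-key observation space (not the asker's roulette env), written against the Gymnasium API:

    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces
    from stable_baselines3 import PPO

    class ToyDictEnv(gym.Env):
        """Hypothetical env whose observation is a Dict of two Boxes."""
        def __init__(self):
            self.observation_space = spaces.Dict({
                "balance": spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32),
                "history": spaces.Box(0.0, 1.0, shape=(4,), dtype=np.float32),
            })
            self.action_space = spaces.Discrete(3)

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            return self.observation_space.sample(), {}

        def step(self, action):
            # dummy dynamics: single-step episodes with zero reward
            return self.observation_space.sample(), 0.0, True, False, {}

    model = PPO("MultiInputPolicy", ToyDictEnv(), verbose=1)  # not "MlpPolicy"
    model.learn(total_timesteps=2_048)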
3
votes
1 answer

Reinforcement learning: deterministic policies worse than non-deterministic policies

We have a custom reinforcement learning environment in which we run a PPO agent from Stable Baselines3 on a multi-action selection problem. The agent learns as expected, but when we evaluate the learned policy from trained agents, the agents…
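For readers comparing such numbers: evaluate_policy defaults to deterministic=True, which for a PPO policy takes the mode of the action distribution instead of sampling as during training, and the two can differ noticeably. A quick comparison sketch, with CartPole standing in for the custom env:

    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    env = gym.make("CartPole-v1")                  # stand-in for the custom env
    model = PPO("MlpPolicy", env).learn(20_000)

    # deterministic=True uses the mode of the action distribution,
    # deterministic=False samples actions as during rollout collection
    for det in (True, False):
        mean_r, std_r = evaluate_policy(model, env, n_eval_episodes=50, deterministic=det)
        print(f"deterministic={det}: {mean_r:.1f} +/- {std_r:.1f}")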
3
votes
0 answers

StableBaselines3 - Why does calling "model.learn(50,000)" twice not give the same result as calling "model.learn(100,000)" once?

I am working on a Reinforcement Learning problem in StableBaselines3. I am trying to understand why this code: model = MaskablePPO(MaskableActorCriticPolicy, env, verbose=1, learning_rate=0.0003, gamma=0.975, seed=10, batch_size=256,…
Vladimir Belik
  • 280
  • 1
  • 12
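One detail that often explains part of the difference: by default every learn() call resets the internal timestep counter, so schedules, exploration and logging start over. Continuing training is usually written as in the sketch below (plain PPO shown instead of the asker's MaskablePPO); even then, exact equality with a single 100,000-step run is not guaranteed, since rollout collection restarts at the second call.

    from stable_baselines3 import PPO

    model = PPO("MlpPolicy", "CartPole-v1", seed=10, verbose=1)
    model.learn(total_timesteps=50_000)
    # reset_num_timesteps=False continues schedules and logging from step 50,000
    # instead of restarting them at 0
    model.learn(total_timesteps=50_000, reset_num_timesteps=False)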
3
votes
1 answer

Stable Baselines3 PPO() - how to change clip_range parameter during training?

I want to gradually decrease the clip_range (epsilon, the PPO policy-update clipping parameter) throughout training in my PPO model. I have tried simply running "model.clip_range = new_value", but this doesn't work. In the docs here, it says…
Vladimir Belik
  • 280
  • 1
  • 12
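For reference, PPO accepts a callable for clip_range that receives the remaining training progress (1.0 at the start, 0.0 at the end), so a decaying clip range is normally configured at construction time rather than by assigning model.clip_range afterwards. A small sketch:

    from stable_baselines3 import PPO

    def clip_schedule(progress_remaining: float) -> float:
        # progress_remaining goes from 1.0 at the start of learn() to 0.0 at the end
        return 0.2 * progress_remaining

    model = PPO("MlpPolicy", "CartPole-v1", clip_range=clip_schedule, verbose=1)
    model.learn(total_timesteps=100_000)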
3
votes
2 answers

RL + optimization: how to do it better?

I am learning about how to optimize using reinforcement learning. I have chosen the problem of maximum matching in a bipartite graph as I can easily compute the true optimum. Recall that a matching in a graph is a subset of the edges where no two…
3
votes
0 answers

Learning rate scheduler in DQN within stable_baselines3

I'm experimenting with Reinforcement Learning using gym and stable-baselines3, particularly using the DQN implementation of stable-baselines3 for the MountainCar (https://gym.openai.com/envs/MountainCar-v0/). I'm trying to implement a learning rate…
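For context, stable-baselines3's learning_rate parameter also accepts a function of the remaining training progress, which acts as a simple scheduler; a sketch for DQN on MountainCar:

    import gymnasium as gym
    from stable_baselines3 import DQN

    def lr_schedule(progress_remaining: float) -> float:
        # linear decay from 1e-3 at the start of training to 0 at the end
        return 1e-3 * progress_remaining

    env = gym.make("MountainCar-v0")
    model = DQN("MlpPolicy", env, learning_rate=lr_schedule, verbose=1)
    model.learn(total_timesteps=200_000)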
3
votes
0 answers

Problem with adding logic for invalid moves in OpenAI Gym and Stable Baselines

I want to integrate my environment into OpenAI Gym and then use the Stable Baselines library to train it. Link to Stable Baselines: https://stable-baselines.readthedocs.io/ The learning method in Stable Baselines is one-line learning and…
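One common approach to invalid moves, rather than changing the training loop, is action masking; in the SB3 ecosystem this is provided by sb3_contrib's MaskablePPO. A rough sketch, where my_custom_env and its valid_actions() method are hypothetical placeholders for the asker's environment:

    import numpy as np
    from sb3_contrib import MaskablePPO
    from sb3_contrib.common.wrappers import ActionMasker

    def mask_fn(env) -> np.ndarray:
        # boolean array with one entry per discrete action; True = allowed.
        # valid_actions() is a hypothetical method of the custom env.
        return env.valid_actions()

    env = ActionMasker(my_custom_env, mask_fn)   # my_custom_env: placeholder
    model = MaskablePPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=100_000)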
2
votes
2 answers

Stable Baselines 3 support for Farama Gymnasium

I am building an environment in the maintained fork of gym: Gymnasium by Farama. In my gym environment, I state that the action_space = gym.spaces.Discrete(5) and the observation_space = gym.spaces.MultiBinary(25). Running the environment with the…
Lexpj
  • 921
  • 2
  • 6
  • 15
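Worth noting under this question: native Gymnasium support landed in stable-baselines3 2.x, and the bundled env checker can confirm the spaces are accepted. A minimal sketch using the question's Discrete(5) action space and MultiBinary(25) observation space in a toy env:

    import gymnasium as gym
    from gymnasium import spaces
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_checker import check_env

    class ToyEnv(gym.Env):
        def __init__(self):
            self.action_space = spaces.Discrete(5)
            self.observation_space = spaces.MultiBinary(25)

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            return self.observation_space.sample(), {}

        def step(self, action):
            # dummy dynamics: single-step episodes with zero reward
            return self.observation_space.sample(), 0.0, True, False, {}

    env = ToyEnv()
    check_env(env)                       # validates spaces and API conformance
    PPO("MlpPolicy", env).learn(total_timesteps=2_048)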
2
votes
0 answers

Why is using a GPU in Stable Baselines 3 slower than using the CPU?

When training on the "CartPole" environment with Stable Baselines 3 using PPO, training the model on a CUDA GPU is almost twice as slow as training it on just the CPU (both in Google Colab and locally). I thought using CUDA for…
Joel
  • 21
  • 2
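For small MlpPolicy networks such as CartPole's, the per-update transfer overhead to the GPU usually outweighs any speedup, which is why the CPU often wins; the device can be forced at construction, as in this quick sketch:

    from stable_baselines3 import PPO

    # small observation/action spaces with MlpPolicy: CPU is usually faster
    model = PPO("MlpPolicy", "CartPole-v1", device="cpu", verbose=1)
    model.learn(total_timesteps=100_000)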
2
votes
1 answer

Why is `ep_rew_mean` much larger than the reward evaluated by the `evaluate_policy()` function?

I wrote a custom gym environment and trained it with PPO provided by stable-baselines3. The ep_rew_mean recorded by TensorBoard is as follows: [figure: the ep_rew_mean curve over 100 million total steps; each episode has 50 steps] As shown in the figure, the…
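A frequent source of such gaps: ep_rew_mean comes from Monitor episode statistics gathered with stochastic actions during rollouts, while evaluate_policy defaults to deterministic actions and reports exact episode returns only on a Monitor-wrapped env. A small sketch of the stochastic comparison, with CartPole standing in for the custom env:

    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.monitor import Monitor
    from stable_baselines3.common.evaluation import evaluate_policy

    env = Monitor(gym.make("CartPole-v1"))        # Monitor records episode returns
    model = PPO("MlpPolicy", env, verbose=1).learn(50_000)

    # deterministic=False mirrors the stochastic rollouts behind ep_rew_mean
    mean_r, std_r = evaluate_policy(model, env, n_eval_episodes=20, deterministic=False)
    print(f"stochastic evaluation: {mean_r:.1f} +/- {std_r:.1f}")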
2
votes
1 answer

No GL context; create a Window first

When I ran the Stable Baselines3 RL Colab notebooks, an error occurred in stable_baselines_getting_started.ipynb: record_video('CartPole-v1', model, video_length=500, prefix='ppo-cartpole') GLException Traceback (most…
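That exception typically means the Colab runtime has no display for OpenGL rendering; the getting-started notebooks work around it by starting a virtual display before recording video, roughly as below (assuming xvfb and pyvirtualdisplay are installed in the runtime):

    # prerequisites (in Colab): !apt-get install -y xvfb  and  !pip install pyvirtualdisplay
    from pyvirtualdisplay import Display

    virtual_display = Display(visible=0, size=(1400, 900))
    virtual_display.start()   # provides an X display so pyglet can create a GL context

    # rendering / video recording (e.g. record_video(...)) can now run headless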
2
votes
1 answer

AssertionError: The algorithm only supports as action spaces but Box(-1.0, 1.0, (3,), float32) was provided

So basically I tried converting this custom gym environment from https://github.com/Gor-Ren/gym-jsbsim to use the Farama Foundation's Gymnasium API. This is my repo which I am working on: https://github.com/sryu1/jsbgym When I try training the…
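The assertion is raised because the chosen algorithm does not list Box among its supported action spaces; with a continuous Box(-1.0, 1.0, (3,)) action space, an algorithm such as SAC (or PPO) is needed rather than a discrete-only one like DQN. A minimal sketch on a standard Gymnasium env with a Box action space:

    import gymnasium as gym
    from stable_baselines3 import SAC

    env = gym.make("Pendulum-v1")             # continuous Box action space
    model = SAC("MlpPolicy", env, verbose=1)  # SAC supports Box actions
    model.learn(total_timesteps=10_000)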