Stable Baselines is a library with implementations of various reinforcement learning algorithms in Python, developed as a community fork of OpenAI Baselines. Please mention the exact version of Stable Baselines being used in the body of the question.
Questions tagged [stable-baselines]
277 questions
0
votes
0 answers
Stable Baselines3 DDPG and TD3 algorithms never finish training
I am building a project utilizing stable-baselines3 to trade stocks. I've based it on this repo: https://github.com/notadamking/Stock-Trading-Environment/blob/master/env/StockTradingEnv.py.
The environment works just fine for the PPO and A2C algorithms…

DataD96
- 301
- 3
- 12
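
The excerpt is truncated, so any diagnosis is a guess, but one frequent cause fits the symptom: in Stable Baselines3, DDPG and TD3 only support Box (continuous) action spaces, while PPO and A2C also accept Discrete ones. A minimal sanity-check sketch (Pendulum-v1 is just a stand-in continuous-action env):

```python
import gym
from stable_baselines3 import TD3

env = gym.make("Pendulum-v1")  # stand-in env with a Box action space
# DDPG/TD3 require a continuous (Box) action space; PPO and A2C do not.
assert isinstance(env.action_space, gym.spaces.Box)

model = TD3("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=5_000)
```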
0
votes
0 answers
Different observation spaces in Multi-Agent Reinforcement Learning using PettingZoo and SuperSuit
I'm trying to create a Multi-Agent Reinforcement Learning setup where there are two types of agents,
each with a different type of observation and action space; precisely, two different sizes of images, one for each agent type.
I am using…

assaf caftory
- 1
- 1
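
One approach commonly shown in SuperSuit's docs (a sketch, not the asker's code; the env and wrapper versions below are assumptions) is to pad the differing spaces so all agents share one observation/action space, then convert to an SB3-compatible vectorized env:

```python
import supersuit as ss
from pettingzoo.butterfly import knights_archers_zombies_v10  # version may differ

env = knights_archers_zombies_v10.parallel_env()
env = ss.pad_observations_v0(env)   # pad smaller observations up to the largest
env = ss.pad_action_space_v0(env)   # likewise for action spaces
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 4, base_class="stable_baselines3")
```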
0
votes
0 answers
Gym Environment: How to choose the reward properly (for stable_baselines3.PPO for Flappy Bird)
Using an openai/gym environment and the PPO algorithm from the stable_baselines3 library, I am trying to build an AI that is able to play Flappy Bird. The problem I am having is that the agent always dies at or before the first pipe and never passes it. My…

UnknownInnocent
- 109
- 11
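
The question is truncated, but a frequent issue with Flappy Bird rewards is that a sparse pipe bonus gives PPO nothing to learn from before the first pipe. A hedged sketch of one common shaping scheme (the helper methods are hypothetical):

```python
def step(self, action):
    obs, done = self._advance(action)  # hypothetical: simulate one frame
    reward = 0.1                       # small living bonus each frame
    if self._passed_pipe():            # hypothetical: pipe cleared this frame
        reward += 10.0                 # the actual objective
    if done:
        reward -= 100.0                # dying is strongly discouraged
    return obs, reward, done, {}
```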
0
votes
1 answer
Weird Learning Pattern for Deep Reinforcement Learning using PPO
I'm training with Proximal Policy Optimization (PPO) using the Stable-Baselines3 package (Reference 1 below), and I'm facing the weird learning pattern shown below (screenshot 1: Learning Pattern).
My action space is…

H.A.
- 1
- 1
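
The referenced screenshot is not available here, but if the pattern concerns the learning rate itself: SB3 accepts a callable learning_rate whose argument, progress_remaining, decays from 1.0 to 0.0 over training. A sketch (the env is a placeholder):

```python
import gym
from stable_baselines3 import PPO

def linear_schedule(initial_lr):
    # progress_remaining goes from 1.0 (start) to 0.0 (end of training)
    return lambda progress_remaining: initial_lr * progress_remaining

env = gym.make("CartPole-v1")  # placeholder env
model = PPO("MlpPolicy", env, learning_rate=linear_schedule(3e-4))
model.learn(total_timesteps=10_000)
```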
0
votes
1 answer
PPO model performance on Tensorboard vs Reality
I've created multiple environments for stock trading. They are quite simple; I've been using episodes of 11 steps. The observation is something like…

José Luis Neves
- 11
- 2
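
A likely source of such gaps, hedged since the question is truncated: TensorBoard's rollout/ep_rew_mean is computed from training episodes, where PPO samples stochastic actions. Evaluating the trained policy separately with deterministic actions is the usual apples-to-apples check (model and eval_env are assumed to exist):

```python
from stable_baselines3.common.evaluation import evaluate_policy

# Training curves use stochastic actions; evaluate deterministically
# on a held-out env for a fair comparison.
mean_reward, std_reward = evaluate_policy(
    model, eval_env, n_eval_episodes=100, deterministic=True
)
print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```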
0
votes
1 answer
Stable Baselines3 creates SB3-{date} folders
I am currently using Stable Baselines3 A2C. Somehow model.learn() keeps creating folders named SB3-{current date and time} for each episode. How can I fix this?

Y MG
- 21
- 1
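
A plausible mechanism, offered as a guess: when SB3's logger is configured without an output folder, it falls back to a directory named SB3-{timestamp}. Pinning the logger (or tensorboard_log) to a fixed path keeps all runs in one place (env is assumed to exist):

```python
from stable_baselines3 import A2C
from stable_baselines3.common.logger import configure

model = A2C("MlpPolicy", env)
new_logger = configure(folder="./a2c_logs",  # fixed, predictable path
                       format_strings=["stdout", "csv"])
model.set_logger(new_logger)
model.learn(total_timesteps=10_000)
```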
0
votes
1 answer
ModuleNotFoundError using Jupyter
I am following a tutorial for Jupyter but I'm getting some errors.
The code is in
https://github.com/nicknochnack/Reinforcement-Learning-for-Trading-Custom-Signals/blob/main/Custom%20Signals.ipynb
At some point it says:
from…

Unagi71
- 1
- 3
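
The truncated import is left as-is; the generic cause of a notebook-only ModuleNotFoundError is that pip installed the package into a different interpreter than the one the Jupyter kernel uses. A sketch of the usual fix, run in a notebook cell:

```python
# Run in a notebook cell; %pip installs into the kernel's own environment.
%pip install stable-baselines3 gym

import stable_baselines3
print(stable_baselines3.__version__)  # confirm the import now resolves
```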
0
votes
0 answers
Multiprocessing in StableBaselines3 - how is batch size distributed?
I am following this multiprocessing notebook. I want to understand how the batch_size parameter of the model is distributed across the multiple environments.
I have a model trained with 1 worker on 1 environment with a batch_size of 64; I understand…

Vladimir Belik
- 280
- 1
- 12
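
To the extent the truncated question asks how batch_size interacts with multiple environments: in SB3's PPO, the rollout buffer collects n_steps * n_envs transitions per update, and batch_size is the minibatch size used when iterating over that whole buffer; it is not split per worker. A sketch:

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

env = make_vec_env("CartPole-v1", n_envs=4)
# One update consumes 128 * 4 = 512 transitions, processed in
# minibatches of 64 drawn from all workers' data pooled together.
model = PPO("MlpPolicy", env, n_steps=128, batch_size=64)
model.learn(total_timesteps=10_000)
```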
0
votes
0 answers
How to prevent the error "AttributeError: module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline'" when importing the gym RL library?
Hello, I have been trying to learn Reinforcement Learning, and I am trying to import the following libraries:
import os
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
from…

Sam
- 65
- 2
- 9
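
This particular cv2 attribute error is commonly reported when multiple or mismatched OpenCV builds coexist in one environment. A hedged cleanup sketch, not a guaranteed fix:

```python
import subprocess
import sys

# Remove possibly conflicting OpenCV builds, then install one fresh copy.
for pkg in ("opencv-python", "opencv-contrib-python", "opencv-python-headless"):
    subprocess.call([sys.executable, "-m", "pip", "uninstall", "-y", pkg])
subprocess.check_call([sys.executable, "-m", "pip", "install", "opencv-python"])
```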
0
votes
1 answer
OpenAI Gym CarRacing
I want to create a reinforcement learning model using stable-baselines3 PPO that can drive the OpenAI Gym CarRacing environment, and I have been having a lot of errors and package compatibility issues.
I currently have this code just for random…

brownie
- 121
- 9
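
Since the code is truncated, here is only a generic random-rollout sketch for CarRacing; it assumes gym >= 0.26, whose reset()/step() API differs from older releases (a frequent source of the compatibility errors mentioned):

```python
import gym

env = gym.make("CarRacing-v2")        # env id/version depends on your gym
obs, info = env.reset()               # gym >= 0.26 returns (obs, info)
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random steering/gas/brake
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```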
0
votes
1 answer
How to set total_timesteps in Stable-Baselines' model.learn() function?
I'm using Stable-Baselines to train an A2C model.
My data length is 9000, so how many total_timesteps should I set in model.learn?
model.learn(total_timesteps = 9000) # ?
I did some research; some suggest 10000 and some suggest 1 million. I'm…

William
- 3,724
- 9
- 43
- 76
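
A rule of thumb, not an official recommendation: total_timesteps counts environment steps across all episodes, so one pass over a 9000-step dataset is 9000 timesteps, and training typically needs many passes. The pass count below is an illustrative assumption (model is assumed to exist):

```python
passes_over_data = 100                                 # assumption for illustration
model.learn(total_timesteps=9000 * passes_over_data)   # ~900k env steps
```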
0
votes
1 answer
I am trying to implement PPO from Stable Baselines3 for my custom environment, and I don't understand some commands
I don't understand the following:
env.close() # what does this mean?
model.learn(total_timesteps=1000) # is total_timesteps here the number of steps after which the neural network model parameters are updated (i.e. the number of time-steps per…

D.g
- 147
- 4
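
A hedged annotation of the two calls, based on SB3's documented behavior (PPO's default n_steps is 2048 per environment):

```python
model.learn(total_timesteps=1000)  # total environment steps to run;
                                   # parameters are updated after every
                                   # n_steps * n_envs collected steps,
                                   # not once at total_timesteps
env.close()                        # releases the env's resources
                                   # (render windows, subprocess workers)
```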
0
votes
1 answer
Python stable_baselines3 - AssertionError: The observation returned by `reset()` method must be an int
I am trying to learn reinforcement learning to train an AI on custom games in Python, and decided to use gym for the environment and stable-baselines3 for the training. I decided to start with a basic tic-tac-toe environment. Here's my code:
import…

wolfrevokcats
- 5
- 2
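
That assertion comes from SB3's env checker when the observation space is Discrete: reset() must then return a plain Python int, not a list or array. A minimal hypothetical sketch:

```python
import gym
from gym import spaces

class TicTacToeEnv(gym.Env):  # hypothetical minimal environment
    def __init__(self):
        self.observation_space = spaces.Discrete(3 ** 9)  # encoded board state
        self.action_space = spaces.Discrete(9)

    def reset(self):
        self.board = [0] * 9
        return 0  # a plain int, as a Discrete observation space requires
```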
0
votes
1 answer
Does "deterministic = True" make sense in box, multi-binary or multi-discrete environments?
Using Stable Baselines 3:
Given that deterministic=True always returns the action with the highest probability, what does that mean for environments where the action space is "box", "multi-binary", or "multi-discrete", where the agent is supposed to…

GHE
- 75
- 4
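
It does make sense, per SB3's distribution implementations: for a Box space the policy is a diagonal Gaussian and deterministic=True returns its mean; for MultiDiscrete/MultiBinary it takes the most likely value of each sub-action independently. A sketch (model and obs assumed to exist):

```python
# deterministic=True: Gaussian mean (Box) or per-dimension argmax
# (MultiDiscrete/MultiBinary); deterministic=False samples instead.
action, _states = model.predict(obs, deterministic=True)
action, _states = model.predict(obs, deterministic=False)
```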
0
votes
1 answer
Stable Baselines3 runtime error: cannot infer dtype of numpy.float32
I am using this on CartPole-v1:
import gym
from stable_baselines3 import A2C
env = gym.make('CartPole-v1')
model = A2C('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
And I am getting:
RuntimeError: Could not infer dtype of numpy.float32

Meet Rajput
- 1
- 1
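
This RuntimeError is raised by PyTorch, and is often reported when the installed NumPy and PyTorch builds are incompatible; checking both versions (and reinstalling a matching pair) is the usual first step. A diagnostic sketch:

```python
import numpy as np
import torch

print("numpy:", np.__version__, "torch:", torch.__version__)
# Reproduces the failure in isolation if the two builds are incompatible:
print(torch.as_tensor(np.float32(1.0)))
```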