Stable Baselines is a library with implementations of various reinforcement learning algorithms in Python, developed as a community fork of OpenAI Baselines. Please mention the exact version of Stable Baselines being used in the body of the question.
Questions tagged [stable-baselines]
277 questions
0
votes
0 answers
Stable Baselines3 DDPG and TD3 algorithms never finish training
I am building a project utilizing stable-baselines3 to trade stocks. I've based it on this repo: https://github.com/notadamking/Stock-Trading-Environment/blob/master/env/StockTradingEnv.py.
The environment works just fine for the PPO and A2C algorithms…

DataD96
- 301
- 3
- 12
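
The excerpt is truncated, so any diagnosis is a guess, but one frequent cause fits the symptom: in Stable Baselines3, DDPG and TD3 only support Box (continuous) action spaces, while PPO and A2C also accept Discrete ones. A minimal sanity-check sketch (Pendulum-v1 is just a stand-in continuous-action env):

```python
import gym
from stable_baselines3 import TD3

env = gym.make("Pendulum-v1")  # stand-in env with a Box action space
# DDPG/TD3 require a continuous (Box) action space; PPO and A2C do not.
assert isinstance(env.action_space, gym.spaces.Box)

model = TD3("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=5_000)
```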
0
votes
0 answers
Different observation spaces in Multi-Agent Reinforcement Learning using PettingZoo and SuperSuit
I'm trying to create a Multi-Agent Reinforcement Learning setup where there are two types of agents,
each with a different type of observation and action space; precisely, two different sizes of images, one for each agent type.
I am using…

assaf caftory
- 1
- 1
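
One approach commonly shown in SuperSuit's docs (a sketch, not the asker's code; the env and wrapper versions below are assumptions) is to pad the differing spaces so all agents share one observation/action space, then convert to an SB3-compatible vectorized env:

```python
import supersuit as ss
from pettingzoo.butterfly import knights_archers_zombies_v10  # version may differ

env = knights_archers_zombies_v10.parallel_env()
env = ss.pad_observations_v0(env)   # pad smaller observations up to the largest
env = ss.pad_action_space_v0(env)   # likewise for action spaces
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 4, base_class="stable_baselines3")
```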
0
votes
0 answers
Gym Environment: How to choose the reward properly (for stable_baselines3.PPO for Flappy Bird)
Using an openai/gym environment and the PPO algorithm from the stable_baselines3 library, I am trying to build an AI that is able to play Flappy Bird. The problem I am having is that the agent always dies at or before the first pipe and never passes it. My…

UnknownInnocent
- 109
- 11
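
The question is truncated, but a frequent issue with Flappy Bird rewards is that a sparse pipe bonus gives PPO nothing to learn from before the first pipe. A hedged sketch of one common shaping scheme (the helper methods are hypothetical):

```python
def step(self, action):
    obs, done = self._advance(action)  # hypothetical: simulate one frame
    reward = 0.1                       # small living bonus each frame
    if self._passed_pipe():            # hypothetical: pipe cleared this frame
        reward += 10.0                 # the actual objective
    if done:
        reward -= 100.0                # dying is strongly discouraged
    return obs, reward, done, {}
```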
0
votes
1 answer
Weird Learning Pattern for Deep Reinforcement Learning using PPO
I'm training with Proximal Policy Optimization (PPO) using the Stable-Baselines3 package (Reference 1 below), and I'm facing the weird learning pattern shown below (screenshot 1: Learning Pattern).
My action space is…

H.A.
- 1
- 1
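
The referenced screenshot is not available here, but if the pattern concerns the learning rate itself: SB3 accepts a callable learning_rate whose argument, progress_remaining, decays from 1.0 to 0.0 over training. A sketch (the env is a placeholder):

```python
import gym
from stable_baselines3 import PPO

def linear_schedule(initial_lr):
    # progress_remaining goes from 1.0 (start) to 0.0 (end of training)
    return lambda progress_remaining: initial_lr * progress_remaining

env = gym.make("CartPole-v1")  # placeholder env
model = PPO("MlpPolicy", env, learning_rate=linear_schedule(3e-4))
model.learn(total_timesteps=10_000)
```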
0
votes
1 answer
PPO model performance on Tensorboard vs Reality
I've created multiple environments for stock trading. They are quite simple; I've been using episodes of 11 steps. The observation is something like…

José Luis Neves
- 11
- 2
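
A likely source of such gaps, hedged since the question is truncated: TensorBoard's rollout/ep_rew_mean is computed from training episodes, where PPO samples stochastic actions. Evaluating the trained policy separately with deterministic actions is the usual apples-to-apples check (model and eval_env are assumed to exist):

```python
from stable_baselines3.common.evaluation import evaluate_policy

# Training curves use stochastic actions; evaluate deterministically
# on a held-out env for a fair comparison.
mean_reward, std_reward = evaluate_policy(
    model, eval_env, n_eval_episodes=100, deterministic=True
)
print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```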
0
votes
1 answer
Stable Baselines3 creates SB3-{date} folders
I am currently using Stable Baselines3 A2C. Somehow model.learn() keeps creating folders named SB3-{current date and time} for each episode. How can I fix this?

Y MG
- 21
- 1
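
A plausible mechanism, offered as a guess: when SB3's logger is configured without an output folder, it falls back to a directory named SB3-{timestamp}. Pinning the logger (or tensorboard_log) to a fixed path keeps all runs in one place (env is assumed to exist):

```python
from stable_baselines3 import A2C
from stable_baselines3.common.logger import configure

model = A2C("MlpPolicy", env)
new_logger = configure(folder="./a2c_logs",  # fixed, predictable path
                       format_strings=["stdout", "csv"])
model.set_logger(new_logger)
model.learn(total_timesteps=10_000)
```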
0
votes
1 answer
ModuleNotFoundError using Jupyter
I am following a tutorial for Jupyter but I'm getting some errors.
The code is in
https://github.com/nicknochnack/Reinforcement-Learning-for-Trading-Custom-Signals/blob/main/Custom%20Signals.ipynb
At some point it says:
from…

Unagi71
- 1
- 3
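
The truncated import is left as-is; the generic cause of a notebook-only ModuleNotFoundError is that pip installed the package into a different interpreter than the one the Jupyter kernel uses. A sketch of the usual fix, run in a notebook cell:

```python
# Run in a notebook cell; %pip installs into the kernel's own environment.
%pip install stable-baselines3 gym

import stable_baselines3
print(stable_baselines3.__version__)  # confirm the import now resolves
```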
0
votes
0 answers
Multiprocessing in StableBaselines3 - how is batch size distributed?
I am following this multiprocessing notebook. I want to understand how the batch_size parameter of the model is distributed across the multiple environments.
I have a model trained with 1 worker on 1 environment with a batch_size of 64; I understand…

Vladimir Belik
- 280
- 1
- 12
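
To the extent the truncated question asks how batch_size interacts with multiple environments: in SB3's PPO, the rollout buffer collects n_steps * n_envs transitions per update, and batch_size is the minibatch size used when iterating over that whole buffer; it is not split per worker. A sketch:

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

env = make_vec_env("CartPole-v1", n_envs=4)
# One update consumes 128 * 4 = 512 transitions, processed in
# minibatches of 64 drawn from all workers' data pooled together.
model = PPO("MlpPolicy", env, n_steps=128, batch_size=64)
model.learn(total_timesteps=10_000)
```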
0
votes
0 answers
How to prevent the error "AttributeError: module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline'" when importing the gym RL library?
Hello, I have been trying to learn Reinforcement Learning, and I am trying to import the following libraries:
import os
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
from…

Sam
- 65
- 2
- 9
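
This particular cv2 attribute error is commonly reported when multiple or mismatched OpenCV builds coexist in one environment. A hedged cleanup sketch, not a guaranteed fix:

```python
import subprocess
import sys

# Remove possibly conflicting OpenCV builds, then install one fresh copy.
for pkg in ("opencv-python", "opencv-contrib-python", "opencv-python-headless"):
    subprocess.call([sys.executable, "-m", "pip", "uninstall", "-y", pkg])
subprocess.check_call([sys.executable, "-m", "pip", "install", "opencv-python"])
```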
0
votes
1 answer
OpenAI Gym CarRacing
I want to create a reinforcement learning model using stable-baselines3 PPO that can drive the OpenAI Gym CarRacing environment, and I have been having a lot of errors and package compatibility issues.
I currently have this code just for random…

brownie
- 121
- 9
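
Since the code is truncated, here is only a generic random-rollout sketch for CarRacing; it assumes gym >= 0.26, whose reset()/step() API differs from older releases (a frequent source of the compatibility errors mentioned):

```python
import gym

env = gym.make("CarRacing-v2")        # env id/version depends on your gym
obs, info = env.reset()               # gym >= 0.26 returns (obs, info)
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random steering/gas/brake
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```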
0
votes
1 answer
How to set total_timesteps in Stable-Baselines' model.learn() function?
I'm using Stable-Baselines to train an A2C model.
My data length is 9000, so how many total_timesteps should I set in model.learn?
model.learn(total_timesteps = 9000) # ?
I did some research; some suggest 10000 and some suggest 1 million. I'm…

William
- 3,724
- 9
- 43
- 76
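
A rule of thumb, not an official recommendation: total_timesteps counts environment steps across all episodes, so one pass over a 9000-step dataset is 9000 timesteps, and training typically needs many passes. The pass count below is an illustrative assumption (model is assumed to exist):

```python
passes_over_data = 100                                 # assumption for illustration
model.learn(total_timesteps=9000 * passes_over_data)   # ~900k env steps
```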
0
votes
1 answer
I am trying to implement PPO from Stable Baselines3 for my custom environment, and I don't understand some commands
I don't understand the following:
env.close() # what does this mean?
model.learn(total_timesteps=1000) # is total_timesteps here the number of steps after which the neural network model parameters are updated (i.e. the number of time-steps per…

D.g
- 147
- 4
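
A hedged annotation of the two calls, based on SB3's documented behavior (PPO's default n_steps is 2048 per environment):

```python
model.learn(total_timesteps=1000)  # total environment steps to run;
                                   # parameters are updated after every
                                   # n_steps * n_envs collected steps,
                                   # not once at total_timesteps
env.close()                        # releases the env's resources
                                   # (render windows, subprocess workers)
```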
0
votes
1 answer
Python stable_baselines3 - AssertionError: The observation returned by `reset()` method must be an int
I am trying to learn reinforcement learning to train an AI on custom games in Python, and decided to use gym for the environment and stable-baselines3 for the training. I decided to start with a basic tic-tac-toe environment. Here's my code:
import…

wolfrevokcats
- 5
- 2
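
That assertion comes from SB3's env checker when the observation space is Discrete: reset() must then return a plain Python int, not a list or array. A minimal hypothetical sketch:

```python
import gym
from gym import spaces

class TicTacToeEnv(gym.Env):  # hypothetical minimal environment
    def __init__(self):
        self.observation_space = spaces.Discrete(3 ** 9)  # encoded board state
        self.action_space = spaces.Discrete(9)

    def reset(self):
        self.board = [0] * 9
        return 0  # a plain int, as a Discrete observation space requires
```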
0
votes
1 answer
Does "deterministic = True" make sense in box, multi-binary or multi-discrete environments?
Using Stable Baselines 3:
Given that deterministic=True always returns the action with the highest probability, what does that mean for environments where the action space is "box", "multi-binary", or "multi-discrete", where the agent is supposed to…

GHE
- 75
- 4
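
It does make sense, per SB3's distribution implementations: for a Box space the policy is a diagonal Gaussian and deterministic=True returns its mean; for MultiDiscrete/MultiBinary it takes the most likely value of each sub-action independently. A sketch (model and obs assumed to exist):

```python
# deterministic=True: Gaussian mean (Box) or per-dimension argmax
# (MultiDiscrete/MultiBinary); deterministic=False samples instead.
action, _states = model.predict(obs, deterministic=True)
action, _states = model.predict(obs, deterministic=False)
```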
0
votes
1 answer
Stable Baselines3 runtime error: cannot infer dtype of numpy.float32
I am using this on CartPole-v1:
import gym
from stable_baselines3 import A2C
env = gym.make('CartPole-v1')
model = A2C('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
And I am getting:
RuntimeError: Could not infer dtype of numpy.float32

Meet Rajput
- 1
- 1
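
This RuntimeError is raised by PyTorch, and is often reported when the installed NumPy and PyTorch builds are incompatible; checking both versions (and reinstalling a matching pair) is the usual first step. A diagnostic sketch:

```python
import numpy as np
import torch

print("numpy:", np.__version__, "torch:", torch.__version__)
# Reproduces the failure in isolation if the two builds are incompatible:
print(torch.as_tensor(np.float32(1.0)))
```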