Ray RLlib is an open-source Python library for reinforcement learning. Use this tag together with the applicable framework tag, such as TensorFlow or PyTorch.
Questions tagged [rllib]
105 questions
2 votes · 0 answers
Training ray.rllib algorithm with vectorized environments
I am working with ray.rllib and I am stuck on using a static method (line 40) to vectorize my custom environment and train it with PPOTrainer(). I use the existing_envs parameter and pass a list of gym.envs that I create manually. Is…

sai lalith Polawar (21)
2 votes · 1 answer
Using graph-mode in Ray RLlib triggers errors when a tf.keras.model.predict() function is called in PPOTFPolicy
I'm using Ray RLlib to train a PPO agent with two modifications to the PPOTFPolicy.
I added a mixin class (say "Recal") to the "mixins" parameter of "build_tf_policy()". This way, the PPOTFPolicy subclasses my "Recal" class and has access…

WPXP (21)
2 votes · 1 answer
Ray[RLlib] Custom Action Distribution (TorchDeterministic)
We know that for a Box (continuous) action space, the corresponding action distribution is DiagGaussian (a probability distribution).
However, I want to use TorchDeterministic (an action distribution that returns the input values…

Paketto (21)
2 votes · 0 answers
The actor died unexpectedly before finishing this task (Ray 1.7.0, SageMaker)
I am running Ray RLlib on SageMaker with an 8-core CPU using the sagemaker_rl library, and I set num_workers to 7.
After a long execution I get: The actor died unexpectedly before finishing this task
class MyLauncher(SageMakerRayLauncher):
def…

Amir Reza SH (21)
2 votes · 0 answers
Using RLlib for a custom multi-agent gym environment
I am trying to set up a custom multi-agent environment using RLlib, but whether I use one of the environments available online or make my own, I run into the same errors, mentioned below. Please help me out.
I have installed whatever they…

SURABHI SHARMA (37)
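The question above concerns wiring a custom multi-agent environment into RLlib. The dict-keyed interface that RLlib's MultiAgentEnv expects can be sketched without RLlib itself; the agent ids, step logic, and TwoAgentEnv class below are hypothetical illustrations, not the asker's environment.

```python
# Minimal sketch of the dict-based contract RLlib's MultiAgentEnv expects
# (no RLlib import needed to illustrate it; everything here is a stand-in).
class TwoAgentEnv:
    def __init__(self, max_steps=5):
        self.agents = ["agent_0", "agent_1"]
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        # One observation per agent, keyed by agent id.
        return {a: 0 for a in self.agents}

    def step(self, action_dict):
        self.t += 1
        obs = {a: self.t for a in action_dict}
        rewards = {a: 1.0 for a in action_dict}
        done = self.t >= self.max_steps
        # RLlib requires the special "__all__" key to end the episode.
        dones = {a: done for a in action_dict}
        dones["__all__"] = done
        return obs, rewards, dones, {}

env = TwoAgentEnv()
obs = env.reset()
done = False
while not done:
    obs, rew, dones, info = env.step({a: 0 for a in env.agents})
    done = dones["__all__"]
print(dones["__all__"])  # True
```

The key point is that every return value is a dict keyed by agent id, plus the "__all__" entry in the dones dict; most shape errors in custom multi-agent envs come from returning flat values instead.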
2 votes · 0 answers
Trained well with DQN, but not learning with A2C
I've used Ray RLlib's DQN to train in my custom simulator. It usually produced good results after 15 million steps.
After playing around with DQN for a while, I'm now trying to train A2C in the simulator. However, it's not even close to converging…

Kai Yun (97)
2 votes · 3 answers
Save episode rewards in ray.tune
I am training several agents with the PPO algorithm in a multi-agent environment using RLlib/Ray. I am using ray.tune() to train the agents and then loading the training data from ~/ray_results. This data contains the actions chosen by the…

mat123a (21)
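For reference, ray.tune writes a progress.csv per trial under ~/ray_results, so episode rewards can be recovered by parsing that file. A minimal sketch with synthetic stand-in rows follows; the column names match Tune's standard RLlib output, but the values are invented.

```python
# Hedged sketch: parse a Tune-style progress.csv with the stdlib.
# The CSV content below is a synthetic stand-in for a real trial's file.
import csv, io

progress_csv = io.StringIO(
    "training_iteration,episode_reward_mean\n"
    "1,10.5\n"
    "2,12.0\n"
    "3,15.5\n"
)
rewards = [float(row["episode_reward_mean"])
           for row in csv.DictReader(progress_csv)]
print(rewards)  # [10.5, 12.0, 15.5]
```

With a real run, the io.StringIO stand-in would be replaced by open() on the trial's progress.csv path under ~/ray_results.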
2 votes · 1 answer
Error: `callbacks` must be a callable method that returns a subclass of DefaultCallbacks, got
When I run some code (DDPG, Deep Deterministic Policy Gradient), this error occurs: ValueError: callbacks must be a callable method that returns a subclass of DefaultCallbacks, got
My code is…

Peter Kim (65)
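The error above usually means an instance (or a plain function) was passed where RLlib expects the DefaultCallbacks subclass itself. A minimal sketch of the distinction, using a stand-in base class so it runs without Ray installed (with Ray, DefaultCallbacks is imported from ray.rllib instead):

```python
# Hedged sketch: RLlib's "callbacks" config entry must be a DefaultCallbacks
# *subclass* (the class object), not an instance or a lambda.
# DefaultCallbacks here is a stand-in for ray.rllib's base class.
class DefaultCallbacks:
    pass

class MyCallbacks(DefaultCallbacks):
    def on_episode_end(self, *, episode=None, **kwargs):
        pass  # record custom metrics here

config = {"callbacks": MyCallbacks}        # correct: pass the class itself
bad_config = {"callbacks": MyCallbacks()}  # wrong: an instance triggers the error

print(isinstance(config["callbacks"], type))  # True
```

The giveaway in the error message is the trailing "got …": it shows the repr of an instance rather than a class.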
2 votes · 0 answers
Ray RLlib: Why is the learn throughput decreasing in DQN Training?
Is it normal for the learn throughput to decrease and the learn time to increase as a Dueling DDQN agent gets trained?
A 7x increase in learn time after a few hours of training is pretty significant; is this something you would expect to see? My…

Nyxynyx (61,411)
2 votes · 1 answer
AWS SageMaker RL with ray: ray.tune.error.TuneError: No trainable specified
I have a training script based on the AWS SageMaker RL example rl_network_compression_ray_custom, but I changed the env to a basic Gym env, Asteroids-v0 (installing dependencies at the main entrypoint of the training script). When I run fit on the…

MorRich (426)
2 votes · 0 answers
Cannot define custom metrics in ray
I'm using a framework called FLOW RL. It enables me to use RLlib and Ray for my RL algorithm. I have been trying to plot non-learning data on TensorBoard. Following the Ray documentation (link), I have tried to add custom metrics. Therefore, I need to…

Antonio Domínguez (51)
2 votes · 1 answer
How do we print action distributions in RLlib during training?
I'm trying to print action distributions at the end of each episode to see what my agent is doing. I've attempted to do this in rock_paper_scissors_multiagent.py by including the following method:
def on_episode_end(info):
episode =…

Charlie Hou (83)
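A common approach to the question above is to tally the episode's actions inside an on_episode_end callback; the tallying itself is plain Python. A sketch with synthetic episode data (the action names are hypothetical, echoing the rock-paper-scissors example the asker mentions):

```python
# Hedged sketch: the bookkeeping an on_episode_end callback would typically
# do to report a per-episode action distribution. Episode data is synthetic.
from collections import Counter

episode_actions = ["rock", "paper", "rock", "scissors", "rock"]
counts = Counter(episode_actions)
total = sum(counts.values())
action_distribution = {a: n / total for a, n in counts.items()}
print(action_distribution["rock"])  # 0.6
```

Inside a real callback, the actions would come from the episode object RLlib passes in rather than a hard-coded list.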
2 votes · 1 answer
Atari score vs reward in rllib DQN implementation
I'm trying to replicate DQN scores for Breakout using RLlib. After 5M steps the average reward is 2.0, while the known score for Breakout using DQN is 100+. I'm wondering if this is because of reward clipping, and therefore the actual reward does not…

Shital Shah (63,284)
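Reward clipping is the usual explanation for this gap: DQN-style Atari training typically clips each per-step reward to its sign, so the reported episode reward undercounts the raw game score. A minimal sketch of the effect (the reward values below are invented, not Breakout's actual brick values):

```python
# Hedged sketch: sign-based reward clipping, as used in DQN-style Atari
# training. Clipped per-step rewards lie in {-1, 0, 1}, so the summed
# "reward" metric is much smaller than the raw game score.
def clip_reward(r):
    return (r > 0) - (r < 0)  # sign of r as an int

raw_rewards = [0, 4, 7, 0, 1]                    # synthetic per-step scores
print(sum(raw_rewards))                          # raw score: 12
print(sum(clip_reward(r) for r in raw_rewards))  # clipped return: 3
```

So a reported mean reward of 2.0 can correspond to a much higher raw game score, depending on how many positive-reward steps an episode contains.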
1 vote · 0 answers
Implement PPO agent using RAY on custom Gym environment
I've been interested in Reinforcement Learning for a while. I've made a simple PPO agent, using Stable Baselines 3, PyTorch and Gym, to trade Bitcoin using Binance data. This was relatively easy and straightforward to set up, get running, and get a…

Max van kekeren (11)
1 vote · 1 answer
How to end episodes after 200 steps in Ray Tune (tune.run()) using a PPO model with torch
I'm using the following code to import a custom environment and then train on it:
from ray.tune.registry import register_env
import ray
from ray import air, tune
from ray.rllib.algorithms.ppo import PPO
from gym_env.cube_env import…

Dawid (13)
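One common way to cap episodes at a fixed step count is a time-limit wrapper around the environment (Gym ships this as gym.wrappers.TimeLimit). A dependency-free sketch of the same idea follows; DummyEnv is a stand-in, not the asker's cube_env.

```python
# Hedged sketch of a TimeLimit-style wrapper: force done=True after
# max_steps regardless of what the wrapped env reports.
class DummyEnv:
    def reset(self):
        return 0
    def step(self, action):
        return 0, 0.0, False, {}  # obs, reward, done, info

class TimeLimitWrapper:
    def __init__(self, env, max_steps=200):
        self.env, self.max_steps, self.t = env, max_steps, 0
    def reset(self):
        self.t = 0
        return self.env.reset()
    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        self.t += 1
        if self.t >= self.max_steps:
            done = True
            info["TimeLimit.truncated"] = True  # Gym's truncation marker
        return obs, rew, done, info

env = TimeLimitWrapper(DummyEnv(), max_steps=200)
env.reset()
steps, done = 0, False
while not done:
    _, _, done, info = env.step(0)
    steps += 1
print(steps)  # 200
```

Registering the wrapped environment (rather than the bare one) with register_env is then enough for the trainer to see 200-step episodes.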