Error processing event with use of ray's PPO algorithm

Question

I am using the PPO algorithm provided by Ray to train an RL agent to stabilize traffic. During training, I keep seeing the following error: ValueError('Observation outside expected value range', Box(500,)

However, I don't know which part of my script is causing this issue, or whether it is caused by Flow at all.

score 0 · Answer 1 · answered Oct 14 '19 at 23:25


Oof, yes, that's a very small bug caused by the RLlib upgrade. Basically, the Ray version we used to use wasn't strict about observations staying within the bounds of the observation space, but the new version of Ray is. You can fix this by going into the corresponding environment and changing the low and high values of the observation space to be slightly more permissive (say, -2 to 2 instead of the current -1 to 1).
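To see which entries of an observation actually fall outside the declared range, and to check that wider bounds would cover them, something like the following may help. This is only a minimal sketch in plain Python: the helper name `find_out_of_bounds` and the sample values are hypothetical and are not part of Flow or RLlib.

```python
from typing import List, Sequence, Tuple


def find_out_of_bounds(
    obs: Sequence[float], low: float, high: float
) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for entries of obs outside [low, high].

    NaN entries are reported too, since they fail the range comparison.
    """
    return [(i, float(v)) for i, v in enumerate(obs) if not (low <= v <= high)]


# Hypothetical observation: mostly normalized, with one entry slightly above 1.
obs = [0.2, -0.5, 1.03, 0.0]

# Bounds as currently declared (-1 to 1) versus the more permissive -2 to 2.
strict = find_out_of_bounds(obs, low=-1.0, high=1.0)
permissive = find_out_of_bounds(obs, low=-2.0, high=2.0)

print(strict)      # [(2, 1.03)]
print(permissive)  # []
```

In the environment itself, the change would presumably be to the `low` and `high` arguments of the `Box` that defines the observation space, for example `Box(low=-2, high=2, shape=(500,), dtype=np.float32)`. The shape here is only a guess based on the `Box(500,)` in the error message.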


Eugene Vinitsky


What do you mean by the corresponding environment? Are you referring to the make_create_env() function that creates the environment? – Isaac Oct 15 '19 at 01:35
