Why does the CartPole-v0 reset after 200 steps?

Question

I was working on CartPole-v0 provided by OpenAI Gym. I noticed that my program always resets after 200 steps. If I sum all the rewards from an episode, where the maximum reward is 1.0 for each timestep, I never get more than 200. I was wondering if there is any configuration I might have missed in the gym library. Has anybody else run into this problem?

score 9 · Answer 1 · answered Jun 04 '18 at 08:37


CartPole-v0 gives a reward of 1.0 for every step your agent is "alive".

The environment is registered with these lines of code:

register(
    id='CartPole-v0',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=200,
    reward_threshold=195.0,
)

which, in the current version of the repository, can be found here.

The max_episode_steps=200 argument means that an episode is automatically ended after 200 steps. Since each step yields a reward of at most 1.0, the maximum score you can get is 200.
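To illustrate why the cap puts a ceiling on the score, here is a minimal sketch in plain Python. It does not use gym and is not gym's actual implementation: `StepLimit`, `AlwaysAliveEnv`, and `run_episode` are hypothetical names made up for this example, and the toy environment simply assumes an agent that never fails, using the older 4-tuple `step` return.

```python
class StepLimit:
    """Hypothetical step-limit wrapper (illustration only, not gym's code)."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self):
        # A reset starts a new episode, so the step counter starts over too.
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.elapsed_steps += 1
        # Force the episode to end once the step budget is used up.
        if self.elapsed_steps >= self.max_episode_steps:
            done = True
        return obs, reward, done, info


class AlwaysAliveEnv:
    """Toy environment (assumption): never fails, gives 1.0 reward per step."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


def run_episode(env):
    """Run one episode with a fixed action and return the summed reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(0)
        total += reward
    return total


score = run_episode(StepLimit(AlwaysAliveEnv(), max_episode_steps=200))
print(score)  # 200.0
```

Even with an agent that never drops the pole, the summed reward stops at 200 because the wrapper ends the episode; a larger limit in the same sketch raises the ceiling accordingly.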


Dennis Soemers


3

I solved this with **`env._max_episode_steps = 500`**, as found [here](https://github.com/openai/gym/issues/463#issuecomment-389873434). Calling **`env.reset()`** will also reset the score, so you may want to write a **`def env_reset():`** function as well. – Avio Apr 24 '19 at 15:38
