OpenAI gym: when is reset required?

Question

Although I can get the examples and my own code to run, I am more curious about the real semantics and expectations behind the OpenAI Gym API, in particular Env.reset().

When is reset expected/required? At the end of each episode? Or only after creating an environment?

I think it makes sense to call it before each episode, but I have not been able to find that stated explicitly!

score 6 · Accepted Answer · edited Nov 07 '19 at 07:36

You typically call reset after an entire episode has finished. That could be after you reach a terminal state in the MDP, or after you reach your maximum number of time steps (set by you). I also typically reset at the very start of training.

So if you are at your starting state 'A' and you want to reach state 'Z', you would run your time steps going from 'A' -> 'B' -> 'C' ..., then when you reach the terminal state 'Z', you start a new episode using reset, which would take you back to 'A'.

    # env, iterations and take_action are assumed to be defined elsewhere
    for episode in range(iterations):
        state = env.reset()  # first state of the new episode
        for time_step in range(1000):  # maximum number of time steps per episode
            action = take_action(state)
            state, reward, done, _ = env.step(action)
            if done:
                break  # takes you to the next episode, where the environment is reset
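To make the 'A' to 'Z' picture concrete, here is a minimal self-contained sketch. `ChainEnv` and `run_episodes` are hypothetical names invented for illustration, not part of Gym; the class only mimics the classic `reset()` / `step()` signatures used in the loop above, so it runs without Gym installed. The random policy is an assumption too.

```python
import random


class ChainEnv:
    """Toy environment (hypothetical): walk from state 'A' to terminal state 'Z'."""

    def __init__(self):
        self.position = 0  # index into the alphabet; 0 is 'A', 25 is 'Z'

    def reset(self):
        # Start a new episode: go back to 'A' and return the first observation.
        self.position = 0
        return "A"

    def step(self, action):
        # action is 0 (stay) or 1 (move one letter forward).
        self.position = min(self.position + action, 25)
        state = chr(ord("A") + self.position)
        done = state == "Z"  # terminal state reached
        reward = 1.0 if done else 0.0
        return state, reward, done, {}


def run_episodes(env, num_episodes, max_steps, seed=0):
    """Run several episodes and record each episode's (first, last) state."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_episodes):
        state = env.reset()  # called at the start of every episode
        first_state = state
        for _ in range(max_steps):
            action = rng.choice([0, 1])  # random policy, for illustration only
            state, reward, done, _ = env.step(action)
            if done:
                break  # terminal state: the next loop iteration resets
        results.append((first_state, state))
    return results


episode_results = run_episodes(ChainEnv(), num_episodes=3, max_steps=1000)
for first_state, last_state in episode_results:
    print(first_state, "->", last_state)
```

Every episode begins at 'A' because of the `reset()` call, regardless of where the previous episode ended. With a small `max_steps` the episode is cut off before 'Z', which corresponds to the time-step limit mentioned above.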

score 0 · Answer 2 · answered Oct 08 '22 at 15:41

Put simply, env.reset() resets the whole environment, so you need to call it for each episode.

This is an example of a reset function inside a custom environment. In this case, it just resets the enemy position and the time.
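A reset of that kind might look roughly like the sketch below. The class name `DodgeEnv`, its attributes, and the constants are all hypothetical, and it is written as a plain Python class rather than a `gym.Env` subclass so that it runs without Gym installed. It follows the classic convention where `reset()` returns only the observation.

```python
class DodgeEnv:
    """Hypothetical custom environment: an enemy moves toward the player over time."""

    ENEMY_START = 10  # assumed starting position of the enemy
    MAX_TIME = 50     # assumed limit on the number of steps per episode

    def __init__(self):
        self.enemy_position = self.ENEMY_START
        self.time = 0

    def reset(self):
        # Restore every piece of per-episode state to its initial value.
        self.enemy_position = self.ENEMY_START
        self.time = 0
        return self._observation()

    def step(self, action):
        # The action is ignored in this sketch; the enemy advances one cell per step.
        self.enemy_position -= 1
        self.time += 1
        done = self.enemy_position <= 0 or self.time >= self.MAX_TIME
        reward = 0.0 if done else 1.0
        return self._observation(), reward, done, {}

    def _observation(self):
        return (self.enemy_position, self.time)


env = DodgeEnv()
env.reset()
done = False
while not done:
    observation, reward, done, _ = env.step(action=0)

state_at_end = observation       # enemy has reached position 0 after 10 steps
state_after_reset = env.reset()  # enemy back at ENEMY_START, time back at 0
```

The point is that `reset()` has to undo whatever `step()` changed during the episode; any state it forgets to restore leaks into the next episode.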

I think seeing what is inside the environment gives a better understanding.

Sorry for the late response.

Thiroshan Suthakar

Please [do not post images of code](https://meta.stackoverflow.com/questions/285551/why-should-i-not-upload-images-of-code-data-errors-when-asking-a-question) - the same holds true for answers. – MagnusO_O Oct 14 '22 at 15:33
