How to train an RL agent in multiple episodes

Question

How can I create an RL agent that has to perform on, for example, 1000 different episodes of 200 time steps each? I am using gym-anytrading and stable-baselines3.

This isn't the type of question for SO. It's very broad and seeking recommendations. Do you have a specific problem with your code? — Michael Ruth, Dec 08 '22 at 19:42
I don't know how to feed the environment with a df of shape (n_episodes, n_timesteps, n_features) — user9085964, Dec 08 '22 at 21:17

score 1 · Answer 1 · answered Dec 12 '22 at 16:32

Something like this? You could also fold the maximum number of steps into the done flag inside the environment's step method.

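# define your env and model above (older Gym API: step() returns a 4-tuple)
episodes = 1000
max_steps = 200
for ep in range(1, episodes + 1):
    state = env.reset()
    done = False
    score = 0
    step = 0

    # stop at the step limit or when the environment reports done
    while step < max_steps and not done:
        # in stable-baselines3, model.predict returns (action, hidden_state)
        action, _ = model.predict(state)
        state, reward, done, _ = env.step(action)
        score += reward
        step += 1
    print(f"Episode {ep} finished after {step} steps with a score of {score}")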
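
The alternative mentioned above, folding the step limit into the done flag, could look roughly like the sketch below. It is only an illustration: `StepLimit`, `DummyEnv` and `run_episode` are hypothetical names, the environment is assumed to follow the older Gym API (reset() returns the observation, step() returns a 4-tuple), and a fixed action stands in for the model's prediction.

```python
class StepLimit:
    """Hypothetical wrapper that ends an episode after max_steps steps."""

    def __init__(self, env, max_steps=200):
        self.env = env
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        # A new episode starts, so the step counter starts over.
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self.steps += 1
        # Fold the step limit into the done flag.
        done = done or self.steps >= self.max_steps
        return state, reward, done, info


class DummyEnv:
    """Stand-in environment (assumption) that never terminates on its own."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 1.0, False, {}


def run_episode(env):
    """Run one episode; the loop no longer needs its own step check."""
    state = env.reset()
    done, score, step = False, 0.0, 0
    while not done:
        # A fixed action stands in for the model's predicted action.
        state, reward, done, _ = env.step(0)
        score += reward
        step += 1
    return step, score


env = StepLimit(DummyEnv(), max_steps=200)
steps, score = run_episode(env)
print(f"Episode finished after {steps} steps with a score of {score}")
```

With the limit inside the environment, the outer loop only has to watch done. Gym also ships a `gym.wrappers.TimeLimit` wrapper with a `max_episode_steps` argument that serves the same purpose, so a hand-written wrapper like this may not be needed.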

score 1 · Answer 2 · answered Dec 21 '22 at 09:47

While I can't provide an exact code example since I can't see your code, I can describe the approach I took in my project. After each step, you can check whether the state is terminal (that is, whether an agent has reached the goal), and you can also count the steps and check whether the count exceeds your threshold. This is how I did it in my code:

while episode_counter < training_episode:
    # initiate your agents here
    while not (agent.is_terminal() or otherAgent.is_terminal() or anotherAgent.is_terminal()):
        # agent executes a step
    episode_counter += 1  # move on to the next episode

Hope this helps. My code used multiple agents, but you can use the same approach for single-agent environments.
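
For a single agent, the same pattern with a step threshold added might be sketched as follows. `CountdownAgent` and `run_training` are hypothetical stand-ins, not the code from the project described above; the agent simply reaches its goal after a fixed number of steps so that the loop can be run as written.

```python
class CountdownAgent:
    """Hypothetical agent that reaches its goal after `distance` steps."""

    def __init__(self, distance):
        self.remaining = distance

    def is_terminal(self):
        return self.remaining <= 0

    def step(self):
        self.remaining -= 1


def run_training(training_episodes, max_steps, distance):
    """Return the number of steps taken in each episode."""
    steps_per_episode = []
    episode_counter = 0
    while episode_counter < training_episodes:
        agent = CountdownAgent(distance)  # initiate the agent for this episode
        steps = 0
        # Stop when the agent is terminal or the step threshold is reached.
        while not agent.is_terminal() and steps < max_steps:
            agent.step()
            steps += 1
        steps_per_episode.append(steps)
        episode_counter += 1
    return steps_per_episode


print(run_training(training_episodes=3, max_steps=200, distance=50))
```

If the goal is further away than the threshold allows, each episode is cut off at max_steps instead.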
