I am training a stable_baselines3 PPO agent and want to run a task on every environment step. To do this, I'm using a callback, CustomCallback, that defines the _on_step method.

However, _on_step appears to be called only once every PPO.n_steps steps: with n_steps=1024, CustomCallback._on_step seems to run only once every 1024 steps.

How can I do something on every single step, instead of once every PPO.n_steps steps?

from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.callbacks import BaseCallback

class CustomCallback(BaseCallback):
    def __init__(self, freq, verbose=0):
        super().__init__(verbose)
        self.freq = freq

    def _on_step(self):
        if self.n_calls % self.freq == 0:
            print('do something')
        return True

env = make_vec_env("CartPole-v1", n_envs=1)
model = PPO("MlpPolicy", env, n_steps=1024)
model.learn(
    total_timesteps=25000,
    callback=CustomCallback(freq=123),
)
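
For comparison, here is a minimal sketch of what the modulo check above would produce if the hook were invoked once per environment step. StepCounter and simulate are hypothetical stand-ins written only for this illustration; they are not part of stable_baselines3 and they assume nothing about how the library actually schedules its callbacks.

```python
class StepCounter:
    """Hypothetical stand-in for a callback; NOT a stable_baselines3 class."""

    def __init__(self, freq):
        self.freq = freq
        self.n_calls = 0     # incremented once per on_step() call
        self.fired_at = []   # call counts at which the task ran

    def on_step(self):
        self.n_calls += 1
        # Same check as in CustomCallback._on_step above.
        if self.n_calls % self.freq == 0:
            self.fired_at.append(self.n_calls)
        return True


def simulate(total_calls, freq):
    """Call on_step() total_calls times and return when the task fired."""
    counter = StepCounter(freq)
    for _ in range(total_calls):
        counter.on_step()
    return counter.fired_at


# Assumption: one on_step() call per environment step, 25000 steps in total.
fired = simulate(total_calls=25000, freq=123)
print(len(fired), fired[:3])
```

Under that assumption the task fires at calls 123, 246, 369 and so on. If the hook ran only a couple of dozen times in total (for example simulate(24, 123)), the check would never pass, so comparing the number of 'do something' lines against these two cases may help narrow down how often _on_step is really being called.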
gameveloster