I am training a stable_baselines3 PPO agent and want to perform some task on every step. To do this, I'm using a callback, CustomCallback, with the _on_step method defined.

But it appears that _on_step is called only once every PPO.n_steps steps, so if the n_steps param is 1024, then CustomCallback._on_step appears to be called only once every 1024 steps.

How can I do something on every single step, instead of once every PPO.n_steps steps?

from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.callbacks import BaseCallback


class CustomCallback(BaseCallback):
    def __init__(self, freq, verbose=0):
        super().__init__(verbose)
        self.freq = freq

    def _on_step(self):
        if self.n_calls % self.freq == 0:
            print('do something')
        return True


env = make_vec_env("CartPole-v1", n_envs=1)
model = PPO("MlpPolicy", env, n_steps=1024)
model.learn(
    total_timesteps=25000,
    callback=CustomCallback(freq=123),
)