Does anyone know how I can checkpoint and save the model for Algorithm/Trainer models in ray-rllib? I know that this is available through ray.tune, but it seems that it is not directly possible for the RLlib algorithms.
The Trainer class already implements a save_checkpoint method as well as a load_checkpoint one (it is a Tune Trainable), so you can checkpoint without going through Tune by calling trainer.save(checkpoint_dir) and trainer.restore(checkpoint_path). The underlying implementation looks like this:
@override(Trainable)
def save_checkpoint(self, checkpoint_dir: str) -> str:
    checkpoint_path = os.path.join(
        checkpoint_dir, "checkpoint-{}".format(self.iteration)
    )
    # Serialize the trainer's full state (policies, counters, etc.).
    with open(checkpoint_path, "wb") as f:
        pickle.dump(self.__getstate__(), f)
    return checkpoint_path

@override(Trainable)
def load_checkpoint(self, checkpoint_path: str) -> None:
    # Restore the previously pickled state into this trainer instance.
    with open(checkpoint_path, "rb") as f:
        extra_data = pickle.load(f)
    self.__setstate__(extra_data)
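Under the hood this is just pickling the object's state dict via __getstate__/__setstate__. As a minimal, RLlib-free sketch of that same save/load contract (the ToyTrainable class, its attributes, and the round-trip at the bottom are all hypothetical, for illustration only):

```python
import os
import pickle
import tempfile


class ToyTrainable:
    """Hypothetical stand-in for a Trainable using the same pickle pattern."""

    def __init__(self):
        self.iteration = 0
        self.weights = {"w": 0.5}

    def __getstate__(self):
        # Everything needed to reconstruct the trainer goes in one dict.
        return {"iteration": self.iteration, "weights": self.weights}

    def __setstate__(self, state):
        self.iteration = state["iteration"]
        self.weights = state["weights"]

    def save_checkpoint(self, checkpoint_dir: str) -> str:
        path = os.path.join(checkpoint_dir, "checkpoint-{}".format(self.iteration))
        with open(path, "wb") as f:
            pickle.dump(self.__getstate__(), f)
        return path

    def load_checkpoint(self, checkpoint_path: str) -> None:
        with open(checkpoint_path, "rb") as f:
            self.__setstate__(pickle.load(f))


# Round-trip: save one instance's state, restore it into a fresh instance.
t = ToyTrainable()
t.iteration = 3
t.weights = {"w": 1.25}
with tempfile.TemporaryDirectory() as d:
    path = t.save_checkpoint(d)
    restored = ToyTrainable()
    restored.load_checkpoint(path)
```

The real RLlib methods follow the same shape; the state dict there additionally carries policy weights and optimizer state.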