
Does anyone know how I can do checkpointing and save the model for algorithm/Trainer models in ray-rllib?

I know this is available for ray.tune, but it seems it is not directly possible for the rllib algorithms.

Afshin Oroojlooy

1 Answer


The Trainer class has a save_checkpoint method as well as a load_checkpoint one (both are overrides of the Trainable interface it inherits from ray.tune):

import os
import pickle

from ray.rllib.utils.annotations import override
from ray.tune import Trainable

# These are methods on the Trainer class (a Trainable subclass):

@override(Trainable)
def save_checkpoint(self, checkpoint_dir: str) -> str:
    checkpoint_path = os.path.join(
        checkpoint_dir, "checkpoint-{}".format(self.iteration)
    )
    # Serialize the trainer's full state (policy weights, counters, etc.).
    with open(checkpoint_path, "wb") as f:
        pickle.dump(self.__getstate__(), f)

    return checkpoint_path

@override(Trainable)
def load_checkpoint(self, checkpoint_path: str) -> None:
    # Restore the trainer's state from a previously saved checkpoint file.
    with open(checkpoint_path, "rb") as f:
        self.__setstate__(pickle.load(f))
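Under the hood this is plain Python pickling of the trainer's state dict via __getstate__/__setstate__. Here is a self-contained sketch of the same mechanism with a toy class (ToyTrainable and its fields are illustrative stand-ins, not part of RLlib), which you can run without installing ray:

```python
import os
import pickle
import tempfile


class ToyTrainable:
    """Minimal stand-in illustrating the checkpoint pattern above."""

    def __init__(self):
        self.iteration = 0
        self.weights = {"w": 0.5}

    def __getstate__(self):
        # Everything needed to reconstruct the object later.
        return {"iteration": self.iteration, "weights": self.weights}

    def __setstate__(self, state):
        self.iteration = state["iteration"]
        self.weights = state["weights"]

    def save_checkpoint(self, checkpoint_dir):
        path = os.path.join(checkpoint_dir, "checkpoint-{}".format(self.iteration))
        with open(path, "wb") as f:
            pickle.dump(self.__getstate__(), f)
        return path

    def load_checkpoint(self, checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            self.__setstate__(pickle.load(f))


with tempfile.TemporaryDirectory() as d:
    t = ToyTrainable()
    t.iteration = 3  # pretend we trained for 3 iterations
    path = t.save_checkpoint(d)

    fresh = ToyTrainable()
    fresh.load_checkpoint(path)
    print(fresh.iteration)  # 3
```

With the real Trainer you would not call these directly; the public save()/restore() methods of the Trainable API invoke them for you.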
Jose