I have a trainer that already has a stop trigger based on the total number of epochs:
trainer = training.Trainer(updater, (epochs, 'epoch'))
Now I would like to add a stopping condition based on the total elapsed time, measured from some earlier point in the code (which may differ from the elapsed_time stored inside the trainer):
import time
global_start = time.time()
# Some long preprocessing
expensive_processing()
# Trainer starts here and its internal elapsed time
# does not take into account the preprocessing
trainer.run()
What I tried is to define an extension as follows.
trainer.global_start = global_start
trainer.global_elapsed_time = 0.0
def stop_training():
    return True
def check_time_limit(my_trainer):
    my_trainer.global_elapsed_time = time.time() - my_trainer.global_start
    # Once the time limit is reached (args.time_limit is in hours),
    # set stop_trigger to a callable that always returns True
    if my_trainer.global_elapsed_time > args.time_limit * 3600:
        my_trainer.stop_trigger = stop_training
# Add the extension to trainer
trainer.extend(check_time_limit, trigger=(1000, 'iteration'))
Running the code, I got a "The previous value of epoch_detail is not saved" error. What did I do wrong?
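To make the intent concrete, here is a minimal sketch of the stopping condition I am after, written as a standalone callable instead of an extension. It rests on two assumptions I have not verified against my setup: that the trainer accepts any callable taking the trainer as its stop trigger, and that trainer.updater.epoch gives the number of completed epochs. The class name TimeOrEpochLimit and its parameters are made up for illustration.

```python
import time


class TimeOrEpochLimit:
    """Hypothetical stop trigger: fires on an epoch limit or a wall-clock limit."""

    def __init__(self, max_epochs, time_limit_s, start_time=None, clock=time.time):
        self.max_epochs = max_epochs
        self.time_limit_s = time_limit_s
        self.clock = clock
        # Measure from an externally supplied start time so that work done
        # before the trainer starts (e.g. preprocessing) counts as well.
        self.start_time = clock() if start_time is None else start_time

    def __call__(self, trainer):
        # Assumption: trainer.updater.epoch is the number of completed epochs.
        if trainer.updater.epoch >= self.max_epochs:
            return True
        # Otherwise stop only once the global time budget is exceeded.
        return self.clock() - self.start_time > self.time_limit_s


# Intended usage (untested sketch, names taken from the snippets above):
# stop = TimeOrEpochLimit(epochs, args.time_limit * 3600, start_time=global_start)
# trainer = training.Trainer(updater, stop)
```

With this, the time limit would be checked every time the stop trigger is evaluated rather than every 1000 iterations, and nothing would need to be reassigned on the trainer while it is running.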
Thank you so much in advance for your help!