I have a trainer that already has a stop trigger based on the total number of epochs:

trainer = training.Trainer(updater, (epochs, 'epoch'))

Now I would like to add a second stopping condition based on the total elapsed time, measured from a point in the code that I choose (so it may differ from the elapsed_time tracked inside trainer):

global_start = time.time()
# Some long preprocessing
expensive_processing()
# Trainer starts here and its internal elapsed time
# does not take into account the preprocessing
trainer.run()

I tried defining an extension as follows:

trainer.global_start = global_start
trainer.global_elapsed_time = 0.0

def stop_training():
    return True

def check_time_limit(my_trainer):
    my_trainer.global_elapsed_time = time.time() - my_trainer.global_start
    # Once the time limit is reached, replace stop_trigger with a callable
    # that always returns True
    if my_trainer.global_elapsed_time > args.time_limit * 3600:
        my_trainer.stop_trigger = stop_training

# Add the extension to trainer
trainer.extend(check_time_limit, trigger=(1000, 'iteration'))

When I run this code, I get the error "The previous value of epoch_detail is not saved". What did I do wrong?
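To make the goal concrete, the condition I am after could be written as a single callable, roughly as sketched below. This is only an illustration, not something I have confirmed fixes the error: TimeOrEpochLimit is a hypothetical name (not part of Chainer), and the sketch assumes that the trainer accepts any callable taking the trainer and returning a bool as its stop trigger, and that trainer.updater.epoch holds the number of completed epochs. The clock argument is there only so the logic can be checked without waiting.

```python
import time


class TimeOrEpochLimit:
    """Hypothetical stop trigger: fires when either limit is reached.

    Not part of Chainer; assumes the trainer calls it as trigger(trainer).
    """

    def __init__(self, max_epochs, time_limit_sec,
                 global_start=None, clock=time.time):
        self.max_epochs = max_epochs
        self.time_limit_sec = time_limit_sec
        self.clock = clock
        # Measure from an externally supplied start time if one is given,
        # otherwise from the moment the trigger is created.
        self.global_start = clock() if global_start is None else global_start

    def __call__(self, trainer):
        elapsed = self.clock() - self.global_start
        # Assumption: trainer.updater.epoch counts completed epochs.
        return (trainer.updater.epoch >= self.max_epochs
                or elapsed > self.time_limit_sec)


# Intended use (assumed, untested against the real trainer):
#   global_start = time.time()
#   expensive_processing()
#   stop = TimeOrEpochLimit(epochs, args.time_limit * 3600, global_start)
#   trainer = training.Trainer(updater, stop_trigger=stop)
```

With this shape the time check would run on every iteration instead of every 1000, and nothing would need to be reassigned on the trainer while it is running.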

Thank you so much in advance for your help!

Sophil