How can I specify the interval between two consecutive checkpoints in TensorFlow? There is no option in tf.train.Saver to specify that. Every time I run the model with a different number of global steps, I get a different interval between checkpoints.

2 Answers
The tf.train.Saver is a "passive" utility for writing checkpoints: it only writes a checkpoint when some other code calls its .save() method. Therefore, the rate at which checkpoints are written depends on what framework you are using to train your model:
If you are using the low-level TensorFlow API (tf.Session) and writing your own training loop, you can simply insert calls to Saver.save() in your own code. A common approach is to do this based on the iteration count:

    for i in range(NUM_ITERATIONS):
      sess.run(train_op)
      # ...
      if i % 1000 == 0:
        saver.save(sess, ...)  # Write a checkpoint every 1000 steps.
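For a more complete picture, here is a minimal runnable sketch of the same pattern; the toy quadratic loss, the /tmp/manual_ckpt path, and the constants are illustrative assumptions, not from the original question:

    import tensorflow as tf

    # Toy model: minimize a single quadratic so the loop has something to run.
    global_step = tf.train.get_or_create_global_step()
    x = tf.Variable(1.0)
    loss = tf.square(x)
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
        loss, global_step=global_step)

    saver = tf.train.Saver()
    NUM_ITERATIONS = 5000

    with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      for i in range(NUM_ITERATIONS):
        sess.run(train_op)
        if i % 1000 == 0:
          # Passing global_step suffixes the checkpoint files with the
          # step number, e.g. model.ckpt-1000.
          saver.save(sess, '/tmp/manual_ckpt/model.ckpt', global_step=i)

Because you control when .save() is called, the interval here is measured in steps rather than wall-clock time.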
If you are using tf.train.MonitoredTrainingSession, which writes checkpoints for you, you can specify a checkpoint interval (in seconds) in the constructor. By default it saves a checkpoint every 10 minutes. To change this to every minute, you would do:

    with tf.train.MonitoredTrainingSession(..., save_checkpoint_secs=60):
      # ...
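A fuller runnable sketch of this is below; the toy quadratic loss, the /tmp/monitored_ckpt directory, and the 10,000-step stop hook are illustrative assumptions, not part of the original answer:

    import tensorflow as tf

    # Toy model so the session has a train_op to run.
    global_step = tf.train.get_or_create_global_step()
    x = tf.Variable(1.0)
    loss = tf.square(x)
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
        loss, global_step=global_step)

    # MonitoredTrainingSession creates its own Saver and writes checkpoints
    # to checkpoint_dir; save_checkpoint_secs sets the wall-clock interval.
    with tf.train.MonitoredTrainingSession(
        checkpoint_dir='/tmp/monitored_ckpt',
        hooks=[tf.train.StopAtStepHook(last_step=10000)],
        save_checkpoint_secs=60) as sess:
      while not sess.should_stop():
        sess.run(train_op)

Note that the interval here is wall-clock seconds, not training steps, so the number of steps between checkpoints will vary with how fast each step runs.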

Thanks!
This fixed my problem:
    tf.contrib.slim.learning.train(
        train_op,
        checkpoint_dir,            # logdir: where checkpoints are written
        log_every_n_steps=args.log_every_n_steps,
        graph=g,
        global_step=model.global_step,
        number_of_steps=args.number_of_steps,
        init_fn=model.init_fn,
        save_summaries_secs=300,   # write summaries every 5 minutes
        save_interval_secs=300,    # write a checkpoint every 5 minutes
        saver=saver)
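(As with MonitoredTrainingSession, save_summaries_secs and save_interval_secs are wall-clock intervals in seconds, so the call above writes a checkpoint roughly every five minutes regardless of how many steps have run.)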
