
I want to know the difference between make_initializable_iterator and make_one_shot_iterator.
1. The TensorFlow documentation says that a "one-shot" iterator does not currently support re-initialization. What exactly does that mean?
2. Are the following two snippets equivalent?
Use make_initializable_iterator

iterator = data_ds.make_initializable_iterator()
data_iter = iterator.get_next()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for e in range(1, epoch+1):
    sess.run(iterator.initializer)
    while True:
        try:
            x_train, y_train = sess.run(data_iter)
            _, cost = sess.run([train_op, loss_op], feed_dict={X: x_train,
                                                               Y: y_train})
        except tf.errors.OutOfRangeError:   
            break
sess.close()

Use make_one_shot_iterator

iterator = data_ds.make_one_shot_iterator()
data_iter = iterator.get_next()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for e in range(1, epoch+1):
    while True:
        try:
            x_train, y_train = sess.run(data_iter)
            _, cost = sess.run([train_op, loss_op], feed_dict={X: x_train,
                                                               Y: y_train})
        except tf.errors.OutOfRangeError:   
            break
sess.close()
Lion Lai

1 Answer


Suppose you want to use the same code to do your training and validation. You might like to use the same iterator, but initialized to point to different datasets; something like the following:

def _make_batch_iterator(filenames):
    dataset = tf.data.TFRecordDataset(filenames)
    ...
    return dataset.make_initializable_iterator()


filenames = tf.placeholder(tf.string, shape=[None])
iterator = _make_batch_iterator(filenames)

with tf.Session() as sess:
    for epoch in range(num_epochs):

        # Initialize iterator with training data
        sess.run(iterator.initializer,
                 feed_dict={filenames: ['training.tfrecord']})

        _train_model(...)

        # Re-initialize iterator with validation data
        sess.run(iterator.initializer,
                 feed_dict={filenames: ['validation.tfrecord']})

        _validate_model(...)

With a one-shot iterator, you can't re-initialize it like this. It runs through its dataset exactly once; after that, every call to `get_next()` raises `OutOfRangeError`. That is why your two snippets are not equivalent: in the second one, the iterator is exhausted after the first epoch, so every subsequent epoch hits `OutOfRangeError` immediately and processes no data (unless the dataset itself is built with `.repeat()`).
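The semantics can be mimicked in plain Python as an analogy (this is not TensorFlow code, and the class names here are made up for illustration): a one-shot iterator behaves like a generator you can consume only once, while an initializable iterator behaves like a holder you can rewind, and even repoint at different data, by calling its initializer again.

```python
# Plain-Python analogy (NOT the tf.data API): one-shot vs. initializable.

def make_one_shot(data):
    # A one-shot iterator: once exhausted, it stays exhausted.
    return iter(data)

class InitializableIterator:
    # Mimics an initializable iterator: initialize() rewinds it,
    # optionally to a different dataset.
    def __init__(self):
        self._it = iter([])

    def initialize(self, data):
        self._it = iter(data)

    def get_next(self):
        # Raises StopIteration when exhausted, much like
        # tf.errors.OutOfRangeError in the real API.
        return next(self._it)

one_shot = make_one_shot([1, 2, 3])
print(list(one_shot))  # [1, 2, 3]
print(list(one_shot))  # [] -- no way to rewind, mirroring snippet 2

init_it = InitializableIterator()
for epoch in range(2):
    init_it.initialize([1, 2, 3])  # rewind at the top of each epoch
    batch = []
    while True:
        try:
            batch.append(init_it.get_next())
        except StopIteration:
            break
    print(batch)  # [1, 2, 3] every epoch, mirroring snippet 1
```

The `initialize()` call here plays the role of `sess.run(iterator.initializer, ...)` in the answer above: it is the explicit "rewind" step that a one-shot iterator simply does not have.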

Scott Smith
  • can you also explain what is the difference between initializable and reinitializable iterators? – Nima Apr 30 '18 at 13:20
  • Can we "reload" dataset? For example: features, labels = trainDataset.make_one_shot_iterator().get_next(), graph = fn(features, labels). Then after training, re-load features, labels = TestDataset.xxx().get_next? Since I presume it's a different dataset instead of re-initializing – Leighton Sep 05 '18 at 01:03
  • @Leighton this would probably mean you create additional dataset graph, which is not what you usually want – Alex Kreimer Apr 19 '20 at 08:28