About the `tf.contrib.data.Dataset` (from TensorFlow 1.2, see here and here) usage: When I use `repeat` (for multiple epochs) together with `shuffle` (as `read_batch_features` does internally), how will I notice when an epoch ends, and what the current epoch is? Also, when an epoch ends, will the `ShuffleDataset` first wait until everything is dequeued, or will it already be filled with more data from the next epoch? In the last epoch, or if I don't use `repeat`, will the `ShuffleDataset` dequeue all remaining data, like `tf.RandomShuffleQueue` dequeueing does after close?
My current solution, which also gives me more control: I don't use `repeat` but go over the data once, using `ShuffleDataset` to get shuffling like `RandomShuffleQueue` provides. At some point I get an `OutOfRangeError`, so I know that I have reached the end of the epoch. Then I reinitialize the iterator, as described here.