
I have the following (shortened) code I am trying to run:

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():

        # Run some code... (reading some data from file 1)

        coord_dev = tf.train.Coordinator()
        threads_dev = tf.train.start_queue_runners(sess=sess, coord=coord_dev)

        try:
            while not coord_dev.should_stop():
                pass  # Run some other code... (reading data from file 2)

        except tf.errors.OutOfRangeError:
            print('Reached end of file 2')
        finally:
            coord_dev.request_stop()
            coord_dev.join(threads_dev)

except tf.errors.OutOfRangeError:
    print('Reached end of file 1')
finally:
    coord.request_stop()
    coord.join(threads)

What is supposed to happen above is that:

  • File 1 is a csv file including training data for my neural network.
  • File 2 includes dev set data.

While iterating over File 1 during training, I occasionally want to calculate cost and accuracy on the dev set data (from File 2) as well. But when the inner loop finishes reading File 2, it obviously triggers a

tf.errors.OutOfRangeError

which causes my code to leave the outer loop as well: the inner loop's exception simply ends up being handled as the outer loop's exception too. But after finishing reading File 2, I want my code to continue training over File 1 in the outer loop.

(I have removed some details, like num_epochs, to improve the readability of the code.)

Does anyone have a suggestion for how to solve this problem? I am a bit new to this.

Thank you in advance!

edn

1 Answer


Solved.

Apparently, using queue runners is not the right way of doing this. The TensorFlow documentation indicates that the Dataset API should be used instead, which took some time to understand. The code below does what I was trying to do previously. I am sharing it here in case other people need it as well.

I have put some additional training code under www.github.com/loheden/tf_examples/dataset api. I struggled a bit to find complete examples.

# READING DATA FROM train and validation (dev set) CSV FILES by using INITIALIZABLE ITERATORS

# All CSV files have the same number of columns. The first column is assumed to be the
# training example ID, the next 5 columns are feature columns, and the last column is the label.

# ASSUMPTIONS: (otherwise, the decode_csv function needs an update)
# 1) The first column is NOT a feature. (It is most probably a training example ID or similar.)
# 2) The last column is always the label, and ONLY 1 column represents the label.
#    If more than 1 column represents the label, see the next example down below

feature_names = ['f1','f2','f3','f4','f5']
record_defaults = [[""], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0]]


def decode_csv(line):
   parsed_line = tf.decode_csv(line, record_defaults)
   label = parsed_line[-1]       # the label is the last element of the list
   del parsed_line[-1]           # delete the last element from the list
   del parsed_line[0]            # also delete the first element because it is assumed NOT to be a feature
   features = tf.stack(parsed_line)  # stack the features so that forward prop etc. can be vectorized later
   #label = tf.stack(label)          # NOT needed; only if more than 1 column makes up the label
   return features, label

filenames = tf.placeholder(tf.string, shape=[None])
dataset5 = tf.data.Dataset.from_tensor_slices(filenames)
dataset5 = dataset5.flat_map(lambda filename: tf.data.TextLineDataset(filename).skip(1).map(decode_csv))
dataset5 = dataset5.shuffle(buffer_size=1000)
dataset5 = dataset5.batch(7)
iterator5 = dataset5.make_initializable_iterator()
next_element5 = iterator5.get_next()

# Initialize `iterator` with training data.
training_filenames = ["train_data1.csv", 
                      "train_data2.csv"]

# Initialize `iterator` with validation data.
validation_filenames = ["dev_data1.csv"]

with tf.Session() as sess:
    # Train 2 epochs. Then run one full pass over the dev set.
    for _ in range(2):     
        sess.run(iterator5.initializer, feed_dict={filenames: training_filenames})
        while True:
            try:
              features, labels = sess.run(next_element5)
              # Train...
              print("(train) features: ")
              print(features)
              print("(train) labels: ")
              print(labels)  
            except tf.errors.OutOfRangeError:
              print("Out of range error triggered (looped through training set 1 time)")
              break

    print("\nDone with the first iterator\n")

    sess.run(iterator5.initializer, feed_dict={filenames: validation_filenames})
    while True:
        try:
          features, labels = sess.run(next_element5)
          # Validate (cost, accuracy) on dev set
          print("(dev) features: ")
          print(features)
          print("(dev) labels: ")
          print(labels)
        except tf.errors.OutOfRangeError:
          print("Out of range error triggered (looped through dev set 1 time only)")
          break  
edn