Question

How do I load data into a placeholder once, and then run multiple computations on the data in the placeholder?

Use case

  • I have 100 numpy arrays (A1, ..., A100) with the same shape.
  • The objective function depends on both the input data and an array of variables B.
  • For example, the loss function for A1 could be `loss_1 = np.sum(A_1) + np.sum(B_1)`.
  • For each An, I want to find the array of variables Bn that minimizes the corresponding loss function lossn.
  • The result should be 100 arrays of variables: B1, ..., B100.

I want to load A1, find B1, and then repeat for the rest of the A arrays.
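As a concrete toy instance of this setup (the values below are made up for illustration; the real loss would couple A_n and B_n rather than just summing them):

import numpy as np

A_1 = np.array([1.0, 2.0, 3.0])   # one of the 100 input arrays
B_1 = np.array([0.5, 0.5, 0.5])   # the variables to optimize for A_1

# The example loss from the list above.
loss_1 = np.sum(A_1) + np.sum(B_1)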

Attempt 1

Loading the A arrays with tf.constant leads to running out of memory: after I load A1 and find B1, loading A2 does not free A1, which stays in GPU memory. After a while, the program uses up all of the GPU's memory.
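For illustration, a minimal sketch of this anti-pattern (array contents made up): each tf.constant adds a new node to the graph, and the graph keeps every node, and hence every A_n, alive for the lifetime of the session.

import numpy as np
import tensorflow as tf

A = [np.random.rand(3).astype(np.float32) for _ in range(100)]

with tf.Session() as session:
    for A_n in A:
        x = tf.constant(A_n)  # new graph node; never freed while the
        y = x * 2             # session is alive, so memory grows
        print(session.run(y))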

Attempt 2

Use a placeholder and feed the same data in every step of the minimization. That is slow because transferring data to the GPU is slow.

import tensorflow as tf

x = tf.placeholder("float", None)
y = x * 2

with tf.Session() as session:
    for j in range(100):  # Iterate over the 100 A arrays
        A_n = [1, 2, 3]  # Stand-in for the j-th of the 100 A arrays.
        for i in range(300):  # Steps of one minimization
            # Why do I have to feed the same data for 300 times??
            result = session.run(y, feed_dict={x: A_n})
            print(result)
hamster on wheels

1 Answer


This can be achieved by converting x to a variable. Variables in TF 1.x are initialized by explicitly running their initializer via Session.run. Therefore, all you need to do is initialize the variable x from a placeholder:

import tensorflow as tf

x_init = tf.placeholder(tf.float32, shape=(3,))
x = tf.Variable(x_init)
y = x * 2

with tf.Session() as sess:
    for j in range(100):
        A_n = [j, j + 1, j + 2]
        # Reinitialize `x` with the new A_n.
        sess.run(x.initializer, feed_dict={x_init: A_n})
        # `x` is initialized and therefore there is nothing to feed.
        for i in range(300):
            result = sess.run(y)
            print(result)

Note that this assumes that the shapes of A_n are the same.
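(An aside, not part of the original answer: if the A_n did not share a shape, one workaround is a placeholder with an unspecified shape together with validate_shape=False, at the cost of TF losing the variable's static shape:)

x_init = tf.placeholder(tf.float32, shape=None)
x = tf.Variable(x_init, validate_shape=False)
y = x * 2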

Sergei Lebedev
  • Can't use `tf.global_variables_initializer()` with this method. That means I have to manually initialize the internal variables of the optimizer. And I run into this issue: https://github.com/tensorflow/tensorflow/issues/8057 – hamster on wheels Jan 03 '19 at 23:18
  • I did not suggest using `tf.global_variables_initializer()`. The snippet in the answer only re-initializes `x`. – Sergei Lebedev Jan 04 '19 at 10:42
  • That is true. Your answer works without `tf.global_variables_initializer()`. But I need to reuse an optimizer to avoid an out-of-memory error, and if I don't use `tf.global_variables_initializer()`, the optimizer behaves very strangely. I asked another question here: https://stackoverflow.com/questions/54043382/reuse-adamoptimizer-and-avoid-strange-behavior – hamster on wheels Jan 04 '19 at 17:33
  • Found a solution: `session.run(tf.global_variables_initializer(), feed_dict=...)`. But reusing the optimizer still leads to strange behavior. – hamster on wheels Jan 04 '19 at 18:06
  • alternatives: https://stackoverflow.com/questions/34220532/how-to-assign-a-value-to-a-tensorflow-variable – hamster on wheels Jan 23 '19 at 16:20
  • But this method might be better. The constant initializer was quite broken in tensorflow, at least in 2017: https://github.com/tensorflow/tensorflow/issues/13433 Using assign in a loop wastes memory: https://github.com/tensorflow/tensorflow/issues/4151 A placeholder looks like the only sensible choice here. – hamster on wheels Jan 25 '19 at 14:49
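Putting the answer and the comment thread together, the following is a sketch of the full per-array minimization loop, not code from either poster. It assumes a toy quadratic loss in place of the real objective, and it resets Adam's internal state between arrays via opt.variables() (available on TF 1.x optimizers), which is one way around the optimizer-reuse problem raised in the comments:

import numpy as np
import tensorflow as tf

A_init = tf.placeholder(tf.float32, shape=(3,))
A = tf.Variable(A_init, trainable=False)  # data slot, reloaded per array
B = tf.Variable(tf.zeros([3]))            # the variables to optimize

# Hypothetical toy loss; the real objective couples A_n and B_n.
loss = tf.reduce_sum((B - A) ** 2)

opt = tf.train.AdamOptimizer(0.1)
train_op = opt.minimize(loss, var_list=[B])
# Built once, outside the loop, so the graph stops growing.
reset_op = tf.variables_initializer([B] + opt.variables())

with tf.Session() as sess:
    for j in range(100):
        A_n = np.arange(j, j + 3, dtype=np.float32)  # stand-in for the real A_n
        sess.run(A.initializer, feed_dict={A_init: A_n})
        sess.run(reset_op)  # fresh B and fresh Adam state for this array
        for i in range(300):
            sess.run(train_op)
        B_n = sess.run(B)  # the optimized variables for this A_n

Building reset_op once, outside the loop, keeps the graph from growing, which also sidesteps the assign-in-a-loop memory issue linked in the last comment.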