
In the simple code below, the gradient is computed correctly.

import tensorflow as tf

x = tf.constant([1, 2, 3, 4], dtype=tf.float32)
y = tf.Variable(tf.ones_like(x), dtype=tf.float32)

y = 2*x
grad = tf.gradients(y, x)

ini = tf.global_variables_initializer()


with tf.Session() as ses:
    ses.run(ini)
    print(ses.run(grad))

The result, as expected, is [array([ 2., 2., 2., 2.], dtype=float32)]. I run into a problem when I try to use tf.assign for the function computation. The code below:

import tensorflow as tf

x = tf.constant([1, 2, 3, 4], dtype=tf.float32)
y = tf.Variable(tf.ones_like(x), dtype=tf.float32)

func = tf.assign(y, 2*x)
grad = tf.gradients(y, x)

ini = tf.global_variables_initializer()

with tf.Session() as ses:
    ses.run(ini)
    ses.run(func)
    print(ses.run(grad))

... yields an error:

TypeError: Fetch argument None has invalid type <class 'NoneType'>.

Why is that so? Is the connection between x and y node somehow "lost" via the tf.assign operation?

Broono

1 Answer


In the second example, there's no dependency between x and y. func is an op that depends on both and happens to modify y. If you inspect the corresponding tf.assign op, you'll see:

op: "Assign"
input: "Variable"   # this is y
input: "mul"        # this is 2*x

But x and y themselves are independent, so tf.gradients(y, x) returns [None]: there is no path from x to y in the graph. Fetching that None in session.run is what raises the TypeError.

Maxim
  • Maxim, thank you for your answer! Can I ask for one more clarification? In the case of gradient computation, should I typically expect a function not to be a graph node, as in my first piece of code above? That is a bit counter-intuitive, as I was assuming that the graph should hold all the key computations... I'm a bit confused about what should and what should not be in the graph. – Broono Jan 13 '18 at 18:15
  • @Broono in terms of tensorflow, each node is an *operation* (short: op). The variable is also a set of ops: read/write/init. What you think of as a *function* is a higher-level abstraction. So it's better to say it this way: the assign op is a graph node, and you can run it. – Maxim Jan 13 '18 at 18:18
  • I see... apparently I need to work on mapping logical flow onto the graph architecture. Thank you for your answers! – Broono Jan 13 '18 at 18:22