
I understand that as long as I am defining a computation inside a tf.GradientTape() context, the gradient tape will compute the gradient w.r.t. all the variables that the output of the computation depends on. However, I think I am not quite grasping the subtleties of the gradient tape, as the following code does not execute as I expect it to:

import tensorflow as tf
x = tf.Variable(2.)
loss_ = x**2-2*x+1
with tf.GradientTape(persistent=True) as g:
    loss = loss_*1
print(g.gradient(loss,x))
output: None

Why is the gradient wrt x not computed?

It seems I am only able to compute gradients w.r.t. the variables that are explicitly used within the context. For example, the following code does not compute gradients either:

import tensorflow as tf
tf.compat.v1.disable_eager_execution()
x = tf.Variable(2.)
t1 = x**2
t2 = -2*x
t3 = 1.
with tf.GradientTape(persistent=True) as g:
    loss = t1+t2+t3
print(g.gradient(loss,x))
figs_and_nuts
  • You need the computation of `loss_` inside the tape context. – xdurch0 Jun 01 '20 at 11:22
  • but I'm computing `loss` inside the context, and I am computing the gradient of `loss`, not `loss_`. What exactly is the right way to do it? – figs_and_nuts Jun 01 '20 at 11:26
  • The entire computation from start to finish needs to be inside the scope. Things that happen outside the scope are not traced. As your code is right now, the tape knows that `loss` comes from `loss_` but it has no idea that `loss_` comes from `x`, and so no gradients can be computed. – xdurch0 Jun 01 '20 at 13:45
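
Following that comment, here is a minimal sketch of the first snippet with the whole computation moved inside the tape (same `x` and `loss` names as in the question):

```python
import tensorflow as tf

x = tf.Variable(2.)
with tf.GradientTape() as g:
    # The entire chain from x to loss is now traced by the tape.
    loss = x**2 - 2*x + 1
print(g.gradient(loss, x))  # tf.Tensor(2.0, shape=(), dtype=float32)
```

The derivative of x**2 - 2*x + 1 is 2*x - 2, which is 2.0 at x = 2, so the tape now returns a gradient instead of None.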

1 Answer

The GradientTape object g goes out of scope after the with statement ends.

In other words, try printing the gradient inside the with statement.

Here's what works for me:

def get_gradients(inputs, target, model):
    # loss_object is assumed to be defined elsewhere, e.g. a tf.keras loss
    with tf.GradientTape() as tape:
        prediction = model(inputs)
        loss = loss_object(target, prediction)
        gradient = tape.gradient(loss, model.trainable_variables)

    return gradient
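
As an illustration, here is a self-contained sketch of how the function above might be used; the tiny Dense model, the dummy inputs, and the MeanSquaredError choice of `loss_object` are assumptions for the demo, not part of the original answer. This variant also calls `tape.gradient` after the `with` block, which avoids the inefficiency warning mentioned in the comments:

```python
import tensorflow as tf

# Assumption for the demo: a concrete loss_object (any tf.keras loss works).
loss_object = tf.keras.losses.MeanSquaredError()

def get_gradients(inputs, target, model):
    with tf.GradientTape() as tape:
        prediction = model(inputs)
        loss = loss_object(target, prediction)
    # Calling gradient outside the context avoids tracing the gradient ops.
    return tape.gradient(loss, model.trainable_variables)

# Hypothetical one-layer model and dummy data, just to exercise the function.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
inputs = tf.ones((2, 3))
target = tf.zeros((2, 1))
grads = get_gradients(inputs, target, model)
print(len(grads))  # one gradient per trainable variable (kernel and bias)
```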
asaf92
  • no it does not. In fact i get the warning: WARNING:tensorflow:Calling GradientTape.gradient on a persistent tape inside its context is significantly less efficient than calling it outside the context (it causes the gradient ops to be recorded on the tape, leading to increased CPU and memory usage). Only call GradientTape.gradient inside the context if you actually want to trace the gradient in order to compute higher order derivatives. None – figs_and_nuts Jun 01 '20 at 11:09
  • OK, so my second guess is that maybe your loss is defined incorrectly. Maybe try to define it (as well as the variable itself) using Keras backend operations. – asaf92 Jun 01 '20 at 11:15
  • 1
    Also check my example: https://colab.research.google.com/drive/18NMvNdoAn1Ip9GmFt6A_Qvqd3jY1honP?usp=sharing – asaf92 Jun 01 '20 at 11:17
  • @asaf92 Your example is nice, but it seems that 1) you need a `modelGradients.set_weights(modelFit.get_weights())` in order to initialize the models with the same weights 2) for larger inputs and outputs you need the inputs and outputs to be two-dimensional (otherwise updates don't seem to match) 3) we must ensure that `fit` and `train` operate on exactly the same batches. – NNN Oct 28 '22 at 05:25