8

I have a trained TensorFlow 2.0 model (built with tf.keras.Sequential()) that takes an input layer with 26 columns (X) and produces an output layer with 1 column (Y).

In TF 1.x I was able to calculate the gradient of the output with respect to the input with the following:

from tensorflow.keras.models import load_model
from tensorflow.keras import backend as K
import tensorflow as tf

model = load_model('mymodel.h5')
sess = K.get_session()
grad_func = tf.gradients(model.output, model.input)
gradients = sess.run(grad_func, feed_dict={model.input: X})[0]

In TF2 when I try to run tf.gradients(), I get the error:

RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

In the question [In TensorFlow 2.0 with eager-execution, how to compute the gradients of a network output wrt a specific layer?](https://stackoverflow.com/questions/56478454/in-tensorflow-2-0-with-eager-execution-how-to-compute-the-gradients-of-a-networ), there is an answer on how to calculate gradients with respect to intermediate layers, but I don't see how to apply this to gradients with respect to the inputs. The TensorFlow documentation for tf.GradientTape has examples of calculating gradients for simple functions, but not for neural networks.

How can tf.GradientTape be used to calculate the gradient of the output with respect to the input?

maurera
  • Does this answer your question? [In TensorFlow 2.0 with eager-execution, how to compute the gradients of a network output wrt a specific layer?](https://stackoverflow.com/questions/56478454/in-tensorflow-2-0-with-eager-execution-how-to-compute-the-gradients-of-a-networ) – curiouscupcake Dec 02 '19 at 19:13
    @LongNguyen - No, it doesn't. I've already linked to that answer in my question and explained why it doesn't answer it. – maurera Dec 02 '19 at 19:22
  • This should answer your question, but it is for a very simple function. The comments there should guide you on how to adapt it to a neural network. If you still have trouble, let me know so I can edit the answer to suit your question. https://stackoverflow.com/questions/35226428/how-do-i-get-the-gradient-of-the-loss-at-a-tensorflow-variable/58314728#58314728 – thushv89 Dec 02 '19 at 19:37
  • @thushv89 - Thanks for the link. I've looked through that example, but haven't been successful trying to adapt it to my question. I've tried: `with tf.GradientTape() as tape: preds = model(model.input); dy_dx = tape.gradient(preds, tf.convert_to_tensor(X))` but this gives the error: "tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 2 and 0 for 'Equal_1' (op: 'Equal') with input shapes: [2], [0]." – maurera Dec 02 '19 at 21:20
  • I'll have a look and post the answer here – thushv89 Dec 02 '19 at 21:34

3 Answers

8

This should work in TF2:

import numpy as np
import tensorflow as tf

# `model` is your trained tf.keras model; the input here is random data
# shaped (batch_size, num_features)
inp = tf.Variable(np.random.normal(size=(25, 120)), dtype=tf.float32)

with tf.GradientTape() as tape:
    preds = model(inp)

grads = tape.gradient(preds, inp)

Basically you do it the same way as TF1, but using GradientTape.
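If your real data X is already a NumPy array, a minimal sketch of the same idea (assuming X has shape (120, 26) to match the question's 26 input columns, and that model is the loaded Sequential model) is to convert it to a tensor and watch it explicitly, since plain tensors, unlike variables, are not tracked by the tape automatically:

import numpy as np
import tensorflow as tf

# hypothetical data matching the question's 26-column input
X = np.random.normal(size=(120, 26)).astype(np.float32)

x = tf.convert_to_tensor(X)
with tf.GradientTape() as tape:
    tape.watch(x)        # plain tensors must be watched explicitly
    preds = model(x)

grads = tape.gradient(preds, x)   # same shape as x: (120, 26)

Wrapping the data in tf.Variable, as above, avoids the need for tape.watch because variables are watched by default.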

Dr. Snoopy
    Thanks! This works (I just had to change the input shape to be "size=(120,26)" since I have 26 input columns). The crux was using tf.Variable() to convert the data (X) from numpy to a tf variable (inp). I had tried tf.convert_to_tensor(), but this didn't work. – maurera Dec 02 '19 at 23:01
    Do you know 1) Why do you need to use tf.Variable() rather than inputting a numpy array directly? 2) Why do you call model(inp) rather than model.predict(inp)? (What's the difference between model(X) and model.predict(X)?) – maurera Dec 02 '19 at 23:48
    I have no idea about variables, but model(x) and model.predict(x) do not do the same thing: predict works with numpy arrays, while model(x) does a symbolic computation that TensorFlow can differentiate. – Dr. Snoopy Dec 02 '19 at 23:50
  • Do you know how to compute the gradient for every example in an (x, y) batch without using a for loop? Using a for loop is too slow. – Song Mar 27 '20 at 03:28
    @Song Please ask your own question with all the details. – Dr. Snoopy Mar 27 '20 at 09:32
1

I hope this is what you're looking for. This will give the gradients of the output w.r.t. the inputs.

import numpy as np
import tensorflow as tf

# Whatever input you like goes in as the initial_value
x = tf.Variable(np.random.normal(size=(25, 120)), dtype=tf.float32)
y_true = np.random.choice([0, 1], size=(25, 10))   # not needed for input gradients

print(model.output)
print(model.predict(x))

with tf.GradientTape() as tape:
  # use model(x), not model.predict(x): predict() returns a NumPy array,
  # which the tape cannot differentiate through
  pred = model(x)

grads = tape.gradient(pred, x)
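
As a quick check, grads has the same shape as x (here (25, 120)), so each row holds the gradient for one example in the batch:

print(grads.shape)   # (25, 120)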
thushv89
  • Thanks for the post. Is this for TF2? When I try this (changing x and y_true to match my data), tf.gradients() results in the error "RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead." I suspect that rather than using tf.gradients(), we need to use tape.gradient(). – maurera Dec 02 '19 at 22:24
  • Edited to use `tape.gradient`. – thushv89 Dec 02 '19 at 23:05
  • Do you know how to compute the gradient for every example in an (x, y) batch without using a for loop? Using a for loop is too slow. – Song Mar 27 '20 at 03:28
  • @Song, can you give an example? If I'm not mistaken, grads will be the same size as your `x` (meaning you'd have gradients for each sample in the batch). And how did you use a for loop? – thushv89 Mar 29 '20 at 22:21
  • @thushv89 Yes, GradientTape returns one gradient even when x contains many examples. Sometimes I need to compute the gradient for every example in a mini-batch so that I can clip each per-example gradient, but GradientTape always seems to return only one gradient, so there seems to be no option except a for loop. The same applies to PyTorch. – Song Apr 01 '20 at 05:54
0

In the case above, you need tape.watch() so that the tape tracks the input tensor:

for (x, y) in test_dataset:
    with tf.GradientTape() as tape:
        # x is a plain tensor, not a variable, so it must be watched explicitly
        tape.watch(x)
        pred = model(x)

    # gradient of the predictions w.r.t. the (watched) input batch
    grads = tape.gradient(pred, x)

but grads will only contain the gradients with respect to the inputs.

The following method is better: use the model to compute predictions and the loss, then use the loss to calculate the gradients of all trainable variables:

with tf.GradientTape() as tape:
    predictions = model(x, training=True)
    loss = loss_function(y, predictions)
grads = tape.gradient(loss, model.trainable_variables)
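
As a usage example, these gradients are typically applied in a custom training step. The sketch below assumes a loss and an optimizer that do not appear above (tf.keras.losses.BinaryCrossentropy and tf.keras.optimizers.Adam are only placeholders):

import tensorflow as tf

loss_function = tf.keras.losses.BinaryCrossentropy()   # assumed loss
optimizer = tf.keras.optimizers.Adam()                 # assumed optimizer

def train_step(x, y):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_function(y, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    # update the model's weights with the computed gradients
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss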
Hong Cheng