
For days I've been struggling simply to view layers' gradients in debug mode with Keras 2. Needless to say, I have already tried code such as:

import keras.backend as K
gradients = K.gradients(model.output, model.input)
sess = tf.compat.v1.keras.backend.get_session()
evaluated_gradients = sess.run(gradients, feed_dict={model.input:images})

or

evaluated_gradients = sess.run(gradients, feed_dict={model.input.experimental_ref(): images})

or

with tf.compat.v1.Session(graph=tf.compat.v1.keras.backend.get_default_graph())

or similar approaches using

tf.compat.v1

which all lead to the following error:

RuntimeError: The Session graph is empty. Add operations to the graph before calling run().

I assume this should be among the most basic tools any deep learning package provides, so it is strange that there seems to be no easy way to do it in Keras 2. Any ideas?

  • You are mixing `keras` and `tf.keras`, that usually doesn't work. Choose one of the two and stick with it. – Daniel Möller Feb 05 '20 at 20:33
  • @DanielMöller but keras2 alone doesn't have set_session either and it won't allow me to see inside of tensors. How should I do so? – Ghazal Sahebzamani Feb 05 '20 at 21:24
  • Then choose `tf.keras` instead of `keras`. – Daniel Möller Feb 05 '20 at 21:27
  • If you are looking into tensors with data (eager mode on), then you don't need to touch the sessions, just go with `tf.keras.gradients` directly. You can also try the gradient tape. There are examples of the tape in this custom training loop: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough – Daniel Möller Feb 05 '20 at 21:29

1 Answer


You can try to do this on TF 2 with eager mode on.

Please notice that you need to use tf.keras for everything, including your model, layers, etc. For this to work, you can never use keras alone; it must be tf.keras. This means, for instance, using tf.keras.layers.Dense, tf.keras.models.Sequential, etc.

input_images_tensor = tf.constant(input_images_numpy)
with tf.GradientTape() as g:
    g.watch(input_images_tensor)
    output_tensor = model(input_images_tensor)

gradients = g.gradient(output_tensor, input_images_tensor)

If you are going to calculate the gradients more than once with the same tape, you need the tape to be persistent=True and delete it manually after you get the gradients. (See details on the link below)
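A minimal, self-contained sketch of a persistent tape (the single Dense layer here is just a stand-in for your model, not from the question):

```python
import tensorflow as tf

# A single Dense layer stands in for a real model (hypothetical example)
layer = tf.keras.layers.Dense(1)

x = tf.constant([[1.0, 2.0, 3.0]])

# persistent=True allows more than one g.gradient() call on the same tape
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = layer(x)
    loss = tf.reduce_sum(y ** 2)

dy_dx = g.gradient(y, x)        # first call
dloss_dx = g.gradient(loss, x)  # second call works only because persistent=True

del g  # release the tape's resources once you are done
```

Without persistent=True, the second g.gradient call raises a RuntimeError because the tape's resources are released after the first call.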

You can get the gradients with respect to any "trainable" weight without needing watch. If you are going to get gradients with respect to non-trainable tensors (such as the input images), then you must call g.watch as above for each of those tensors.
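To illustrate the trainable-weights case, a small sketch (again using a single hypothetical Dense layer; any tf.keras model works the same way via model.trainable_variables):

```python
import tensorflow as tf

# Hypothetical layer standing in for a full model
layer = tf.keras.layers.Dense(2)
x = tf.random.normal((5, 4))

with tf.GradientTape() as g:
    # no g.watch needed here: trainable variables are watched automatically
    loss = tf.reduce_mean(layer(x) ** 2)

# one gradient tensor per trainable variable (kernel and bias)
grads = g.gradient(loss, layer.trainable_variables)
```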

More details on GradientTape: https://www.tensorflow.org/api_docs/python/tf/GradientTape

Daniel Möller