I would like to compute the Hessian of a neural network's loss function in TensorFlow with respect to all the parameters (i.e. all trainable variables). By modifying the example code from the TensorFlow documentation (https://www.tensorflow.org/api_docs/python/tf/GradientTape) I managed to compute the Hessian w.r.t. the weight matrix of the first layer (if I'm not mistaken):
with tf.GradientTape(persistent=True) as tape:
    loss = tf.reduce_mean(model(x, training=True)**2)
    g = tape.gradient(loss, model.trainable_variables[0])
h = tape.jacobian(g, model.trainable_variables[0])
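For completeness, the snippets here run against a minimal setup along these lines (a hypothetical small dense model and a random input batch; my real model is bigger but behaves the same):

```python
import tensorflow as tf

# Hypothetical stand-in for my real model: 3 inputs -> 4 hidden -> 1 output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Random input batch of 8 samples.
x = tf.random.normal([8, 3])
```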
If I try to compute it w.r.t. model.trainable_variables instead, tape.jacobian complains that 'list' object has no attribute 'shape'. So I instead tried flattening model.trainable_variables into a single vector and computing the Hessian w.r.t. that:
with tf.GradientTape(persistent=True) as tape:
    loss = tf.reduce_mean(model(x, training=True)**2)
    source = tf.concat([tf.reshape(v, [-1]) for v in model.trainable_variables], axis=0)
    g = tape.gradient(loss, source)
h = tape.jacobian(g, source)
The problem now is that g comes back as None for some reason. I noticed that source is a tf.Tensor while model.trainable_variables[0] was a tf.ResourceVariable, so I tried changing that by declaring source as
source = resource_variable_ops.ResourceVariable(
    tf.concat([tf.reshape(v, [-1]) for v in model.trainable_variables], axis=0))
This didn't change anything though, so I'm guessing that's not the issue. I also thought the problem might be that the source variable isn't being watched by the tape, but it is marked as trainable, and even if I call tape.watch(source) explicitly, g is still None.
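To narrow it down, I reproduced the None gradient in a tiny example: even a tensor built directly from a variable via tf.reshape gets a None gradient. This makes me suspect that the loss is computed from the variables themselves, so it never actually "flows through" the derived tensor, and watching it can't help (toy example, not my real code):

```python
import tensorflow as tf

v = tf.Variable([1.0, 2.0])
with tf.GradientTape(persistent=True) as tape:
    loss = tf.reduce_sum(v ** 2)  # loss is computed from v directly
    flat = tf.reshape(v, [-1])    # derived from v, but loss never uses it
    tape.watch(flat)              # watching doesn't change anything

g_v = tape.gradient(loss, v)        # works: [2., 4.]
g_flat = tape.gradient(loss, flat)  # None: no path from `flat` to `loss`
```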
Does anybody know how I can solve this?
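For now, the closest I've gotten is computing the Hessian block-by-block, with one tape.jacobian call per pair of variables, and stitching the blocks into a single matrix afterwards (sketch on a hypothetical toy model standing in for mine). It works, but it feels clumsy, so I'd still prefer a direct way to get the full Hessian w.r.t. all trainable variables at once:

```python
import tensorflow as tf

# Hypothetical toy model standing in for my real network.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])
x = tf.random.normal([8, 3])

params = model.trainable_variables
with tf.GradientTape(persistent=True) as tape:
    loss = tf.reduce_mean(model(x, training=True) ** 2)
    # One gradient tensor per variable, computed inside the tape so the
    # gradient ops are recorded for the second differentiation below.
    grads = tape.gradient(loss, params)

# blocks[i][j] has shape params[i].shape + params[j].shape.
blocks = [[tape.jacobian(g, p) for p in params] for g in grads]
del tape  # release the persistent tape

# Flatten the grid of blocks into one (n, n) Hessian matrix,
# where n is the total number of parameters.
sizes = [int(tf.size(p)) for p in params]
hessian = tf.concat(
    [tf.concat([tf.reshape(b, (sizes[i], sizes[j]))
                for j, b in enumerate(row)], axis=1)
     for i, row in enumerate(blocks)],
    axis=0,
)
```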