
I am trying to find the gradient of a simple model whose output is a concatenated vector, but GradientTape is returning a scalar value instead of a vector. My code is attached below.

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model

p0 = tf.keras.Input(shape=(1, 1), name="p0")
# p1, p2, q0 and q1 are built the same way in the full model
p1, p2, q0, q1 = [tf.keras.Input(shape=(1, 1), name=n) for n in ("p1", "p2", "q0", "q1")]
X0 = tf.keras.initializers.Constant([0.4])
layer1_p01 = tf.keras.layers.Dense(1, kernel_initializer=X0, activation='linear', use_bias=False, kernel_constraint='NonNeg')(p0)
layer1_p02 = tf.keras.layers.add([p0, layer1_p01])
layer1_p0 = tf.keras.layers.Concatenate(axis=-1, trainable=False, name="PX1")([layer1_p01, layer1_p02])

# my model
model1 = Model(inputs=[p0, p1, p2, q0, q1], outputs=[layer1_p0])


# Gradient for layer1:
with tf.GradientTape(persistent=True) as tape1:
  tape1.watch(model1.trainable_variables)
  output_model1 = model1({'p0': np.array([[[0.5]]]), 'p1': np.array([[[0.3]]]),
                          'p2': np.array([[[0.2]]]), 'q0': np.array([[[0.5]]]),
                          'q1': np.array([[[0.5]]])})

#print(output_model1)
#print(model1.trainable_variables[0])
my_grad = tape1.gradient(output_model1, model1.trainable_variables)
print(my_grad)

The output is: [<tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[1.]], dtype=float32)>]

Here I am computing the gradient of [p0*X0, p0 + p0*X0] with respect to X0. With p0 = 0.5 the gradient should be the vector [p0, p0] = [0.5, 0.5], but GradientTape is giving the sum p0 + p0 = 1.0. How can I fix this problem?
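The same behaviour can be reproduced without the model; here is a minimal sketch with a single scalar variable (x and y are stand-ins for illustration, not part of my model):

import tensorflow as tf

x = tf.Variable(0.5)
with tf.GradientTape() as tape:
  y = tf.stack([x, x])        # vector output; each element has derivative 1 w.r.t. x
print(tape.gradient(y, x))    # tf.Tensor(2.0, ...) -- the per-element gradients [1, 1] are summed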


1 Answer


I solved this problem. tf.gradients 'constructs symbolic derivatives of sum of ys w.r.t. x in xs', and tape.gradient behaves the same way: it differentiates the sum of the output elements, so it automatically sums the derivatives taken with respect to the same variable. If you want the derivatives as a vector, you need the Jacobian instead.

my_grad = tape1.jacobian(output_model1, model1.trainable_variables)
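With the inputs from the question this recovers the per-element derivatives [p0, p0] = [0.5, 0.5]. The Jacobian's shape is the output shape followed by the variable shape, (1, 1, 2) + (1, 1), so the printout should look roughly like this:

print(my_grad)
# [<tf.Tensor: shape=(1, 1, 2, 1, 1), dtype=float32,
#    numpy=array([[[[[0.5]], [[0.5]]]]], dtype=float32)>]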