I have a Keras model with a two-unit output (binary classification):
model.output # <tf.Tensor 'dense_1_3/MatMul:0' shape=(?, 2) dtype=float32>
and
model.input # <tf.Tensor 'bidirectional_1_input:0' shape=(?, ?, 200) dtype=float32>
I evaluated three different gradients for an example input of shape (1,50,200):
gradients0 = K.gradients(model.output[:,0], model.inputs)
gradients1 = K.gradients(model.output[:,1], model.inputs)
gradients2 = K.gradients(model.output, model.inputs)
I expected the first two expressions to yield the gradients for the individual output neurons, and the last one to yield a tensor containing both of them.
To my surprise, all three gradients have a shape of (1,50,200). In my opinion, gradients2 should have shape (2,50,200), since model.output has two components. What is Keras computing in this case?
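One possible explanation, going by the tf.gradients documentation (which K.gradients calls on the TensorFlow backend): the result for each input is the sum of dy/dx over all entries of y, not a Jacobian, so it always has the shape of the input. The sketch below illustrates that behaviour in plain Python under stated assumptions. The toy function `model` and the finite-difference helper `grad` are hypothetical stand-ins, not Keras APIs, and the input shape (2, 3) is a small analogue of (50, 200).

```python
import math

def model(x):
    """Toy stand-in for the network: maps a (T, D) input to 2 outputs."""
    w0, w1 = 0.5, -1.5  # hypothetical fixed weights
    s = sum(math.tanh(v) for row in x for v in row)
    q = sum(v * v for row in x for v in row)
    return [w0 * s + q, w1 * s]

def grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f w.r.t. every entry of x."""
    g = [[0.0] * len(row) for row in x]
    for t, row in enumerate(x):
        for d in range(len(row)):
            orig = row[d]
            row[d] = orig + eps
            hi = f(x)
            row[d] = orig - eps
            lo = f(x)
            row[d] = orig  # restore the entry before moving on
            g[t][d] = (hi - lo) / (2 * eps)
    return g

x = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.5]]  # shape (2, 3)

g0 = grad(lambda v: model(v)[0], x)        # analogue of gradients0
g1 = grad(lambda v: model(v)[1], x)        # analogue of gradients1
g_sum = grad(lambda v: sum(model(v)), x)   # gradient of the summed output

# Largest elementwise gap between g_sum and g0 + g1.
max_diff = max(
    abs(g_sum[t][d] - (g0[t][d] + g1[t][d]))
    for t in range(len(x))
    for d in range(len(x[0]))
)
print(len(g_sum), len(g_sum[0]))  # same shape as the input
print(max_diff < 1e-6)
```

If the same holds for the real model, gradients2 should match gradients0 + gradients1 elementwise, which would account for the (1,50,200) shape. A full (2,50,200) Jacobian would then have to be assembled from the per-output gradients.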