For example, y = Ax, where A is a diagonal matrix with its trainable weights (w1, w2, w3) on the diagonal.
A = [w1 ... ...
... w2 ...
... ... w3]
How can I create such a trainable A in TensorFlow or Keras?
If I try A = tf.Variable(np.eye(3)), the total number of trainable weights would be 3*3 = 9, not 3, whereas I only want to update the 3 weights (w1, w2, w3).
A trick may be to use A = tf.Variable([1, 1, 1]) * np.eye(3), so that the 3 trainable weights are mapped onto the diagonal of A.
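To make the trick concrete, here is a minimal sketch of what I have in mind, assuming TensorFlow 2.x in eager mode. The names `w`, `mask`, and `x`, and the example values, are my own placeholders. I use float weights here because integer variables are not differentiable.

```python
import numpy as np
import tensorflow as tf

# Three trainable weights (float dtype so that gradients can flow).
w = tf.Variable([1.0, 2.0, 3.0])

# Fixed 3x3 identity mask; it is a constant, so it adds no trainable weights.
mask = tf.constant(np.eye(3), dtype=tf.float32)

# Example input column vector (placeholder values).
x = tf.constant([[4.0], [5.0], [6.0]])

with tf.GradientTape() as tape:
    # Broadcasting gives A[i, j] = w[j] * mask[i, j], i.e. diag(w1, w2, w3).
    A = w * mask
    y = tf.matmul(A, x)       # y = A x
    loss = tf.reduce_sum(y)   # scalar: 4*w1 + 5*w2 + 6*w3

grad = tape.gradient(loss, w)
print(A.numpy())
print(grad.numpy())  # [4. 5. 6.]
```

In this sketch only `w` is a variable, so there are 3 trainable weights, and the gradient with respect to each weight is the matching entry of `x`. For the purely diagonal case, `tf.linalg.diag(w)` should build the same matrix without a mask.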
My questions are:
Would that trick work for my purpose? Would the gradient be correctly calculated?
What if A is more complicated? E.g. if I want to create:
where w1, w2, ..., w6 are the weights to be updated.