So I have a model (A) that I'm training with a custom training procedure, supported by a second model (B). This means, of course, that I have to use tf.GradientTape and compute and apply the gradients myself.
However, it doesn't work as expected: instead of gradients, the tape just returns a list of [None, None, ...]. Code snippet:
with tf.GradientTape() as tape:
    outputs = model(input_batch, training=True)            # output of model A
    critic_output = critic_model(outputs, training=True)   # output of model B
    loss = critic_loss(critic_output, 1)                   # loss of model B with input generated by A

model_grads = tape.gradient(loss, model.trainable_variables)  # returns [None, ...]
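
For reference, here is a minimal, self-contained sketch of the surrounding training step. The small Dense models, the placeholder critic_loss, and the Adam optimizer are just stand-ins for illustration, not my real code:

import tensorflow as tf

# Placeholder stand-ins for my real models; the real ones are more complex
model = tf.keras.Sequential([tf.keras.layers.Dense(16, activation="relu"),
                             tf.keras.layers.Dense(8)])         # model A
critic_model = tf.keras.Sequential([tf.keras.layers.Dense(16, activation="relu"),
                                    tf.keras.layers.Dense(1)])  # model B

def critic_loss(critic_output, target):
    # Placeholder critic loss: cross-entropy against a constant target label
    labels = tf.fill(tf.shape(critic_output), float(target))
    return tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(labels, critic_output, from_logits=True))

optimizer = tf.keras.optimizers.Adam(1e-4)
input_batch = tf.random.normal((32, 10))

with tf.GradientTape() as tape:
    outputs = model(input_batch, training=True)            # output of model A
    critic_output = critic_model(outputs, training=True)   # output of model B
    loss = critic_loss(critic_output, 1)                   # loss of model B on A's output

model_grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(model_grads, model.trainable_variables))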
The models themselves are correct; I have tested every aspect of them. This is also not the first time I have used gradient tapes to compute gradients. Here, however, instead of returning a list of tensors, the gradient call just returns a list of Nones. I don't know what's going on.
I also looked into this other post and added every variable to the gradient tape with .watch() (sketch below), and it didn't change anything. All other gradients are computed without problems. I have tried pretty much everything and am desperate for help.
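
For completeness, this is roughly what the .watch() attempt looked like (again just a sketch, using the same placeholder names as above):

with tf.GradientTape() as tape:
    # explicitly watch every trainable variable of model A, as suggested in the other post
    for var in model.trainable_variables:
        tape.watch(var)
    outputs = model(input_batch, training=True)
    critic_output = critic_model(outputs, training=True)
    loss = critic_loss(critic_output, 1)

model_grads = tape.gradient(loss, model.trainable_variables)  # still [None, ...] in my real code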