
So I have a model (A) that I'm training with a custom training procedure, supported by a second model (B). This means I have to use tf.GradientTape and compute and apply the gradients myself.

However, it doesn't work as expected: instead of gradients, just a list [None, None, ...] is returned. Code snippet:

with tf.GradientTape() as tape:
  outputs = model(input_batch, training=True)  # output of model A
  critic_output = critic_model(outputs, training=True)  # output of model B
  loss = critic_loss(critic_output, 1)  # loss of model B with input generated by A
  model_grads = tape.gradient(loss, model.trainable_variables)  # returns [None, ...]

The models themselves are correct; I tested every aspect of them. This is not the first time I have used gradient tapes to compute gradients, but this time, instead of returning a list of tensors, the gradient call just returns a list of Nones. I don't know what's going on.

I also looked into this other post and added every variable to the gradient tape with .watch(), and it didn't change anything. All other gradients are computed without problems. I have tried pretty much everything and am desperate for help.
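Roughly, the .watch() attempt looked like this (a sketch, not my exact code; same names as in the snippet above):

with tf.GradientTape() as tape:
  for var in model.trainable_variables:
    tape.watch(var)  # explicitly watch every trainable variable of model A
  outputs = model(input_batch, training=True)
  critic_output = critic_model(outputs, training=True)
  loss = critic_loss(critic_output, 1)
  model_grads = tape.gradient(loss, model.trainable_variables)  # still [None, ...]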

Marie M.
  • I wonder... since the "loss" is based on "critic_output", maybe you have to add "critic_model.trainable_variables" to the "tape.gradient" call? – Mark Lavin Aug 22 '21 at 15:48
  • What @MarkLavin said. Maybe even replace model.trainable_variables with critic_model.trainable_variables? Alternatively, and this is just a guess, you could try training the models under separate GradientTapes? – kelkka Aug 23 '21 at 17:16
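A minimal sketch of what the comments suggest, reusing the names from the snippet above (this is only the suggested shape of the training step, not a confirmed fix): a persistent tape allows gradients to be taken for both models' variable lists from the same forward pass.

with tf.GradientTape(persistent=True) as tape:
  outputs = model(input_batch, training=True)  # output of model A
  critic_output = critic_model(outputs, training=True)  # output of model B
  loss = critic_loss(critic_output, 1)

model_grads = tape.gradient(loss, model.trainable_variables)  # gradients for model A
critic_grads = tape.gradient(loss, critic_model.trainable_variables)  # gradients for model B
del tape  # a persistent tape holds resources until it is deleted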

0 Answers