
I would like to train with the outputs of intermediate layers, using the soft nearest neighbour loss as a regularizer as described in https://arxiv.org/pdf/1902.01889.pdf.

Soft nearest neighbour loss equation
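
Written out, for a batch of b samples x with labels y and temperature T, the loss is (as far as I understand the paper):

    l_{sn}(x, y, T) = -\frac{1}{b} \sum_{i=1}^{b}
        \log \frac{\sum_{j \ne i,\; y_j = y_i} e^{-\lVert x_i - x_j \rVert^2 / T}}
                  {\sum_{k \ne i} e^{-\lVert x_i - x_k \rVert^2 / T}}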

Therefore, I have tried to implement this with a tf.GradientTape:

    with tf.GradientTape() as tape:
        # tape.watch(inputs)
        predictions = model(inputs, training=True)

        # TODO: how do I also get the outputs of all intermediate layers here,
        # from the same forward pass that produced `predictions`?
        softnn_loss = 0
        for one_layer_output in intermediate_layers_output:
            softnn_loss += softnn_obj(one_layer_output, labels)

        # Supervised loss on the final predictions plus the weighted
        # soft nearest neighbour regularizer.
        pred_loss = loss_fn(labels, predictions)
        total_loss = pred_loss + lamb * softnn_loss

        # Add any layer regularization losses (e.g. weight decay) if present.
        if len(model.losses) > 0:
            regularization_loss = tf.math.add_n(model.losses)
            total_loss = total_loss + regularization_loss

    gradients = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
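
For context, `softnn_obj` above refers to a callable computing the soft nearest neighbour loss over a batch of layer activations; a minimal sketch of what I have in mind (assuming integer class labels, float32 activations, and a `temperature` hyperparameter):

    import tensorflow as tf

    def soft_nearest_neighbour_loss(features, labels, temperature=1.0, eps=1e-9):
        # Flatten each example's activations so any layer output shape works.
        features = tf.reshape(features, [tf.shape(features)[0], -1])
        # Pairwise squared Euclidean distances within the batch.
        sq_norms = tf.reduce_sum(tf.square(features), axis=1, keepdims=True)
        dists = sq_norms - 2.0 * tf.matmul(features, features, transpose_b=True) + tf.transpose(sq_norms)
        # Exponentiated similarities; zero out the diagonal so a point is not its own neighbour.
        sims = tf.exp(-dists / temperature) * (1.0 - tf.eye(tf.shape(features)[0]))
        # Mask of pairs that share the same label.
        labels = tf.reshape(labels, [-1, 1])
        same_label = tf.cast(tf.equal(labels, tf.transpose(labels)), sims.dtype)
        numerator = tf.reduce_sum(sims * same_label, axis=1)
        denominator = tf.reduce_sum(sims, axis=1)
        return -tf.reduce_mean(tf.math.log(eps + numerator / (eps + denominator)))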

The solutions I have read about require creating a new model/function for each layer's output (see the sketch below), but that does not work for me because the gradients are not updated accordingly. How can I obtain the outputs of all intermediate layers with only one forward pass, so that the gradients are computed correctly and this particular model can be trained?
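To clarify, the per-layer approach I mean looks roughly like this (assuming a functional Keras model; the layer names are placeholders):

    # One extra Keras model per intermediate layer (layer names are hypothetical).
    extractors = [
        tf.keras.Model(inputs=model.input, outputs=model.get_layer(name).output)
        for name in ["conv_1", "conv_2", "dense_1"]
    ]
    # Inside the tape, every extractor call is another forward pass,
    # which is exactly what I want to avoid.
    intermediate_layers_output = [extractor(inputs, training=True) for extractor in extractors]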

What I am looking for: a way to obtain the intermediate layers' outputs with a single forward pass.

vimuth

0 Answers