
I'm implementing a Convolutional Neural Network in TensorFlow with Python. I'm in the following scenario: I've got a tensor of labels y (batch labels) like this:

y = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

where each row is a one-hot vector that represents the label of the corresponding example. Now, during training, I want to stop the loss gradient (set it to 0) for examples with this label (the third one):

       [1,0,0]

which represents the n/a label, while the loss of the other examples in the batch is still computed. For my loss computation I use a method like this:

self.y_loss = kl_divergence(self.pred_y, self.y)

I found the tf.stop_gradient function, which stops the gradient, but how can I apply it conditionally to the batch elements?

Comment: Did you switch the arguments for the `kl_divergence()` function? I think that confused me when I wrote my answer. – Styrke, Apr 22 '17

1 Answer


If you don't want some samples to contribute to the gradients, you could avoid feeding them to the network during training at all: simply remove the samples with that label from your training set.
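For example, with numpy arrays (hypothetical x_train/y_train names, assuming one-hot label rows as in the question), the filtering could look like this:

    import numpy as np

    # Hypothetical training set: three samples with one-hot labels
    x_train = np.random.rand(3, 4)
    y_train = np.array([[0, 1, 0],
                        [0, 0, 1],
                        [1, 0, 0]])

    keep = y_train[:, 0] != 1   # drop samples with the n/a label [1, 0, 0]
    x_train, y_train = x_train[keep], y_train[keep]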

Alternatively, since the loss is computed by summing the KL divergences of the individual samples, you could multiply each sample's KL divergence by 1 if the sample should be taken into account and by 0 otherwise, before summing them. You can get the vector of multipliers you need by subtracting the first column of the label tensor from 1: 1 - y[:, 0]
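For the example batch above, this mask works out to [1, 1, 0]. A minimal sketch (assuming TF 1.x-style APIs, as in the rest of this answer):

    import tensorflow as tf

    y = tf.constant([[0., 1., 0.],
                     [0., 0., 1.],
                     [1., 0., 0.]])
    mask = 1 - y[:, 0]   # zeroes out the n/a sample

    with tf.Session() as sess:
        print(sess.run(mask))   # [1. 1. 0.]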

For the kl_divergence function from the answer to your previous question it might look like this:

def kl_divergence(p, q):
    # Sum the per-sample KL divergences, multiplying each one by the
    # mask (1 - p[:, 0]) so that n/a samples contribute zero loss.
    return tf.reduce_sum(tf.reduce_sum(p * tf.log(p / q), axis=1) * (1 - p[:, 0]))

where p is the ground-truth tensor and q is the tensor of predictions.
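A quick end-to-end sketch of the masked loss (TF 1.x style; the prediction values and the small epsilon guarding against log(0) on one-hot labels are my own additions, not part of the original answer):

    import tensorflow as tf

    y = tf.constant([[0., 1., 0.],
                     [0., 0., 1.],
                     [1., 0., 0.]])          # ground truth; third row is n/a
    pred_y = tf.constant([[0.2, 0.6, 0.2],
                          [0.1, 0.2, 0.7],
                          [0.5, 0.3, 0.2]])  # hypothetical predicted probabilities

    eps = 1e-8  # avoids log(0) where a one-hot component is exactly 0
    per_sample_kl = tf.reduce_sum(y * tf.log((y + eps) / (pred_y + eps)), axis=1)
    loss = tf.reduce_sum(per_sample_kl * (1 - y[:, 0]))  # third sample masked out

    with tf.Session() as sess:
        print(sess.run(loss))

Note that the ground truth goes first here, the opposite of the kl_divergence(self.pred_y, self.y) call in the question; that is exactly the argument swap the comment above points out.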
