
I am trying to reproduce the EuclideanLoss from Caffe in TensorFlow. I found a function called tf.nn.l2_loss which, according to the documentation, computes the following:

output = sum(t ** 2) / 2
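
As a quick sanity check, that formula can be reproduced directly; here is a small NumPy sketch with made-up values that mirrors what `tf.nn.l2_loss` computes on a flat tensor:

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0])

# Same formula as tf.nn.l2_loss: sum of squares, halved
l2 = np.sum(t ** 2) / 2.0
print(l2)  # 7.0  (= (1 + 4 + 9) / 2)
```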

When looking at the EuclideanLoss in the Python version of caffe it says:

def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff**2) / bottom[0].num / 2.
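
The Caffe snippet above can be checked numerically; note the extra division by `bottom[0].num`, the batch size. A small NumPy sketch with made-up values:

```python
import numpy as np

# Two examples per batch, two values each (made-up data)
pred   = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 3.0], [5.0, 4.0]])

diff = pred - target
num = pred.shape[0]  # bottom[0].num: the batch size

# np.sum(diff**2) / num / 2., exactly as in forward() above
loss = np.sum(diff ** 2) / num / 2.0
print(loss)  # 1.25  (= (0 + 1 + 4 + 0) / 2 / 2)
```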

In the original documentation the loss is given as:

E = 1/(2N) * sum_n ||y_hat_n - y_n||^2

where N is the number of examples in the batch.

To me this is exactly the same computation. However, for the same net my loss values are around 3000 in TensorFlow but only roughly 300 in Caffe. So where is the difference?

  • I would say that you need to divide also by the batch size (10?), or even better, use `tf.reduce_mean()` to calculate the average loss of the batch. – Manolo Santos Jul 11 '17 at 07:47
  • Hm even better is not what I asked for. I am asking for the exact same loss?! @ManoloSantos –  Jul 11 '17 at 08:56
  • It is the same loss. `tf.reduce_mean(x) / 2. == tf.reduce_sum(x) / x.shape[0] / 2.` – Manolo Santos Jul 11 '17 at 09:15
  • Okay could you answer the question and write down the exact tensor flow loss to use? I am a bit confused. @ManoloSantos –  Jul 11 '17 at 09:53

1 Answer


tf.nn.l2_loss does not take the batch size into account when computing the loss. To get the same value as Caffe, you have to divide by the batch size. The easiest way to do that is to use the mean (sum / n):

import tensorflow as tf

y_pred = tf.constant([1, 2, 3, 4], tf.float32)
y_real = tf.constant([1, 2, 4, 5], tf.float32)

# Mean of the squared differences, halved (sum / n / 2)
mse_loss = tf.reduce_mean(tf.square(y_pred - y_real)) / 2.

sess = tf.InteractiveSession()  # TF 1.x session API
print(mse_loss.eval())          # 0.25
Manolo Santos
  • I am 100% sure this is not equal to the EuclideanLoss from Caffe, since I have been using this formula already! It returns values in the range [0, 1], since my ground-truth values lie in [0, 1]; that is due to the mean function, which returns the mean rather than a sum. It is not correct. Could you have a look at this link: [caffe euclideanloss](http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1EuclideanLossLayer.html) –  Jul 11 '17 at 11:04
  • @thigi. Mmmh, I think you are missing the definition of mean. `mean(x) = (1 / N) * sum(x)` – Manolo Santos Jul 11 '17 at 11:16
  • Is it possible that your targets (y) are vectors instead of scalars? In that case, it is different. You have to sum only over the last axis (-1) and divide by the batch size – Manolo Santos Jul 11 '17 at 11:40
  • Yes, they are like this: `tf.placeholder(tf.float32, shape=[1, 64, 64, 54])`. Could you update your answer? –  Jul 12 '17 at 07:09
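
For 4-D targets like the placeholder above, this thread's point is that a Caffe-equivalent loss would sum over all elements and divide only by the batch dimension, whereas tf.reduce_mean divides by every element. A sketch of the difference, using NumPy for clarity (in a real graph you would swap np.sum/np.mean for tf.reduce_sum/tf.reduce_mean; the shape is taken from the comment above, the data is random):

```python
import numpy as np

rng = np.random.default_rng(0)
y_pred = rng.random((1, 64, 64, 54)).astype(np.float32)
y_real = rng.random((1, 64, 64, 54)).astype(np.float32)

batch_size = y_pred.shape[0]

# Caffe EuclideanLoss: sum over everything, divide by batch size only
caffe_loss = np.sum((y_pred - y_real) ** 2) / batch_size / 2.0

# reduce_mean-style loss: divides by *every* element, not just the batch
mean_loss = np.mean((y_pred - y_real) ** 2) / 2.0

# The two differ by the number of elements per example
print(caffe_loss / mean_loss)  # ≈ 221184 (= 64 * 64 * 54)
```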