
I am trying to implement the marginal loss introduced in the paper [1]. This is what I have so far:

import tensorflow as tf

def marginal_loss(model1, model2, y, margin, threshold):
    # normalization factor 1 / (m^2 - m) from the paper
    margin_ = 1. / (tf.pow(margin, 2) - margin)
    tmp = (1. - y)
    # Euclidean distance between the two embeddings, per pair
    euc_dist = tf.sqrt(tf.reduce_sum(tf.pow(model1 - model2, 2), 1, keep_dims=True))
    thres_dist = threshold - euc_dist
    mul_val = tf.multiply(tmp, thres_dist)
    sum_ = tf.reduce_sum(mul_val)
    return tf.multiply(margin_, sum_)

However, after some epochs the loss value becomes NaN, and I am not sure what mistake I made. Furthermore, I used 1 in place of the epsilon described in the paper because its value was not clear. Similarly, the exact threshold value is also not known.
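For debugging, the same computation can be mirrored in NumPy to inspect intermediate values (distances, per-pair terms) outside the TensorFlow graph; this is just a sketch of my current code, not a fix:

```python
import numpy as np

# NumPy mirror of the TensorFlow loss above, for inspecting values eagerly
def marginal_loss_np(emb1, emb2, y, margin, threshold):
    euc_dist = np.sqrt(np.sum((emb1 - emb2) ** 2, axis=1, keepdims=True))
    per_pair = (1.0 - y) * (threshold - euc_dist)
    return np.sum(per_pair) / (margin ** 2 - margin)

# one negative pair (y = 0) at distance 5, with threshold 6 and margin 2
loss = marginal_loss_np(np.array([[3.0, 4.0]]), np.array([[0.0, 0.0]]),
                        np.array([[0.0]]), margin=2.0, threshold=6.0)
print(loss)  # 0.5
```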

Thanks for any help.

[1] https://ibug.doc.ic.ac.uk/media/uploads/documents/deng_marginal_loss_for_cvpr_2017_paper.pdf

1 Answer


This looks very similar to the problem raised in this other question. The problem likely comes from the use of tf.sqrt, which has the bad property that its gradient goes to infinity as its argument approaches zero, causing instabilities as your model converges.
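You can see the blow-up directly from the derivative d/dx sqrt(x) = 1/(2*sqrt(x)); a quick numeric check:

```python
import math

# d/dx sqrt(x) = 1 / (2 * sqrt(x)): unbounded as x -> 0, which is exactly
# what happens when a pair's squared distance approaches zero
def sqrt_grad(x):
    return 1.0 / (2.0 * math.sqrt(x))

print(sqrt_grad(1.0))    # 0.5
print(sqrt_grad(1e-12))  # ~5e5 -- huge gradient steps, eventually NaN
```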

Try to get rid of the tf.sqrt in your loss, for example by minimizing the square of your current loss.
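For instance, one way to do this (a sketch, shown in NumPy for clarity; the TensorFlow version is analogous) is to compare the squared distance against the squared threshold, so no sqrt appears and the gradient of the distance term is finite everywhere:

```python
import numpy as np

# hypothetical sqrt-free variant of the loss in the question: work with
# squared distances and a squared threshold instead of taking tf.sqrt
def marginal_loss_sq(emb1, emb2, y, margin, threshold):
    sq_dist = np.sum((emb1 - emb2) ** 2, axis=1, keepdims=True)
    per_pair = (1.0 - y) * (threshold ** 2 - sq_dist)
    return np.sum(per_pair) / (margin ** 2 - margin)
```

Note this changes the scale of the loss, so the margin and threshold may need retuning.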

Alternatively, you can rely on existing built-in functions like tf.losses.hinge_loss (though it does not support multidimensional outputs).
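For reference, my understanding of what tf.losses.hinge_loss computes (sketched in NumPy): labels in {0, 1} are shifted to {-1, +1}, then the mean of max(0, 1 - labels * logits) is taken:

```python
import numpy as np

# NumPy sketch of the hinge loss as I understand tf.losses.hinge_loss:
# {0, 1} labels shifted to {-1, +1}, then mean(max(0, 1 - labels * logits))
def hinge_loss(labels, logits):
    signed = 2.0 * labels - 1.0
    return np.mean(np.maximum(0.0, 1.0 - signed * logits))
```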

P-Gn
  • Without tf.sqrt, the loss is negative. The main problem is not the NaN; that is probably due to a faulty implementation. It is the implementation itself. Can you please check whether my implementation is correct or not? – Kamran Janjua Jul 17 '18 at 09:41
  • These negative values are probably due to the threshold value; I am not sure what the exact threshold should be. When I set it to 1 the loss goes negative; when I set it to 0 it stays normal, and the loss decreases in both cases. But I am not sure about the implementation. – Kamran Janjua Jul 17 '18 at 09:47