
I've got a loss function that fulfills my needs, but it's only in PyTorch. I need to port it to my TensorFlow code; while most of it can trivially be "translated", I am stuck on a particular line:

y_hat[:, torch.arange(N), torch.arange(N)] = torch.finfo(y_hat.dtype).max  # to be "1" after sigmoid

You can see the whole code below, and it is indeed pretty straightforward except for that line:

import torch
import torch.nn.functional as F

def get_loss(y_hat, y):
    # No loss on diagonal
    B, N, _ = y_hat.shape
    y_hat[:, torch.arange(N), torch.arange(N)] = torch.finfo(y_hat.dtype).max  # to be "1" after sigmoid

    # calc loss
    loss = F.binary_cross_entropy_with_logits(y_hat, y)  # cross entropy

    y_hat = torch.sigmoid(y_hat)
    tp = (y_hat * y).sum(dim=(1, 2))
    fn = ((1. - y_hat) * y).sum(dim=(1, 2))
    fp = (y_hat * (1. - y)).sum(dim=(1, 2))
    loss = loss - ((2 * tp) / (2 * tp + fp + fn + 1e-10)).sum()  # fscore

    return loss
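
For reference, it can be exercised like this (shapes follow the function's [B, N, N] assumption; the targets' diagonal is set to 1 here, which is what the finfo trick seems to rely on so that the diagonal contributes no loss):

B, N = 4, 8
y_hat = torch.randn(B, N, N)                 # raw logits
y = torch.randint(0, 2, (B, N, N)).float()   # binary targets
y[:, torch.arange(N), torch.arange(N)] = 1.  # diagonal targets are 1
print(get_loss(y_hat, y))                    # scalar loss tensor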

So far I have come up with the following:

import tensorflow as tf

def get_loss(y_hat, y):
    loss = tf.keras.losses.BinaryCrossentropy()(y_hat, y)  # cross entropy (but no logits)

    y_hat = tf.math.sigmoid(y_hat)

    tp = tf.math.reduce_sum(tf.multiply(y_hat, y), [1, 2])
    fn = tf.math.reduce_sum(y - tf.multiply(y_hat, y), [1, 2])
    fp = tf.math.reduce_sum(y_hat - tf.multiply(y_hat, y), [1, 2])
    loss = loss - ((2 * tp) / tf.math.reduce_sum(2 * tp + fp + fn + 1e-10))  # fscore

    return loss

So my questions boil down to:

  • What does torch.finfo() do and how to express it in TensorFlow?
  • Does y_hat.dtype just return the data type?

1 Answer

1. What does torch.finfo() do and how to express it in TensorFlow?

.finfo() provides a neat way to get machine limits for floating-point types. It is available in NumPy, PyTorch, and TensorFlow (under tf.experimental.numpy).

.finfo().max returns the largest finite number representable by that dtype.

NOTE: There is also an analogous .iinfo() for integer types.

Here are a few examples of finfo and iinfo in action.

import torch

print('FLOATS')
print('float16', torch.finfo(torch.float16).max)
print('float32', torch.finfo(torch.float32).max)
print('float64', torch.finfo(torch.float64).max)
print('')
print('INTEGERS')
print('int16', torch.iinfo(torch.int16).max)
print('int32', torch.iinfo(torch.int32).max)
print('int64', torch.iinfo(torch.int64).max)

This prints:

FLOATS
float16 65504.0
float32 3.4028234663852886e+38
float64 1.7976931348623157e+308

INTEGERS
int16 32767
int32 2147483647
int64 9223372036854775807

If you want to implement this in TensorFlow, you can use tf.experimental.numpy.finfo:

import tensorflow as tf

print(tf.experimental.numpy.finfo(tf.float32))
print('Max ->', tf.experimental.numpy.finfo(tf.float32).max)  # <---- THIS IS WHAT YOU WANT

This prints:

Machine parameters for float32
---------------------------------------------------------------
precision =   6   resolution = 1.0000000e-06
machep =    -23   eps =        1.1920929e-07
negep =     -24   epsneg =     5.9604645e-08
minexp =   -126   tiny =       1.1754944e-38
maxexp =    128   max =        3.4028235e+38
nexp =        8   min =        -max
---------------------------------------------------------------

Max -> 3.4028235e+38
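
To replicate the full line from the question, note that TensorFlow tensors are immutable, so PyTorch's in-place diagonal assignment has no direct equivalent. A minimal sketch (assuming y_hat has shape [B, N, N]) combining finfo with tf.linalg.set_diag:

import tensorflow as tf

def mask_diagonal(y_hat):
    # Fill the diagonal of every [N, N] matrix in the batch with the dtype's
    # largest value, so the diagonal becomes 1 after sigmoid
    big = tf.experimental.numpy.finfo(y_hat.dtype).max
    diag = tf.fill(tf.shape(y_hat)[:-1], tf.constant(big, dtype=y_hat.dtype))
    return tf.linalg.set_diag(y_hat, diag)

Unlike the PyTorch version, this returns a new tensor rather than modifying y_hat in place.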

2. Does y_hat.dtype just return the data type?

YES.

In PyTorch, it returns something like torch.float32. In TensorFlow, it returns something like tf.float32.
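
For example:

import torch
import tensorflow as tf

print(torch.zeros(2, 2).dtype)  # torch.float32
print(tf.zeros((2, 2)).dtype)   # <dtype: 'float32'>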

– Akshay Sehgal
  • Thank you. So basically the line sets all diagonal elements to the maximum of the datatype, and the code just returns that value? Since TensorFlow seems to only be using tf.float32, can I safely just use tf.float32.max instead of finfo and dtype? – SergeantIdiot Jan 10 '21 at 20:37
  • Correct, if you are sure that the dtype is float32, then `tf.float32.max` will give the same result. – Akshay Sehgal Jan 10 '21 at 20:38
  • Setting it to the max value will give a value of 1 over the diagonal after `sigmoid`, as the quick check below shows. I think that's what they are trying to do in the first loss function. – Akshay Sehgal Jan 10 '21 at 20:40
  • So do you know why it wasn't just `y_hat[:, torch.arange(N), torch.arange(N)] = 1` in the PyTorch implementation? – Ivan Jan 10 '21 at 21:24
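
As a quick check of why the max is used rather than 1 (sigmoid of 1 is only about 0.73, so only a very large logit saturates to exactly 1 after sigmoid):

import torch

x = torch.tensor([1.0, torch.finfo(torch.float32).max])
print(torch.sigmoid(x))  # tensor([0.7311, 1.0000])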