
I want to write a function with two arguments, A and B, tensors of the same shape (for example, 13x13, or some other shape), that returns a number representing the sum of all losses when binary cross-entropy is applied componentwise. So, for A[i, j] and B[i, j] we find the binary cross-entropy loss, and then sum over all i and j. How can I implement that in Keras and TensorFlow?

nbro
Alem

2 Answers


The solution suggested in this answer may actually not be what you (reader) are looking for.

If A and B are NxM, where M > 1, then binary_crossentropy(A, B) will not compute the binary cross-entropy element-wise. Instead, binary_crossentropy(A, B) returns an array of shape Nx1, where binary_crossentropy(A, B)[i] corresponds to the average binary cross-entropy between A[i] and B[i] (i.e. it computes the binary cross-entropy between A[i][j] and B[i][j], for all j, then averages the M binary cross-entropies).

If you want to calculate the binary cross-entropy between the element A(i, j) and B(i, j), for all i and j, then you may first want to reshape A and B, so that they have the shape (N*M)x1.

import numpy as np
import tensorflow as tf

# Flatten the 4x2 arrays to shape (8, 1), so each row holds a single element
a = np.random.rand(4, 2).reshape((-1, 1))
b = np.random.rand(4, 2).reshape((-1, 1))
print("ce between a[i, j] and b[i, j] =", tf.losses.binary_crossentropy(a, b))
print("average cross-entropy =", np.mean(tf.losses.binary_crossentropy(a, b)))

However, if you want to compute the binary cross-entropy between A and B element-wise and take the average of all the binary cross-entropies, then you do not need to reshape A and B. So, if A and B are NxM arrays, then binary_crossentropy(A, B) produces an Nx1 array, where each element corresponds to the average binary cross-entropy between row i of A and row i of B (for i=1, ..., N). Finally, to take the average of all binary cross-entropies, we also need to take the average of binary_crossentropy(A, B), i.e. tf.reduce_mean(binary_crossentropy(A, B)).
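As a concrete sketch of this second case (assuming the TensorFlow 2.x API, where binary_crossentropy lives under tf.keras.losses):

```python
import numpy as np
import tensorflow as tf

# A and B are NxM arrays of probabilities (here 4x2)
A = np.random.rand(4, 2)
B = np.random.rand(4, 2)

# One average binary cross-entropy per row: shape (4,)
per_row = tf.keras.losses.binary_crossentropy(A, B)
print(per_row.shape)  # (4,)

# Average over the rows as well to get a single scalar
overall = tf.reduce_mean(per_row)
print(float(overall))
```

Note that no reshape is needed here: the per-row averaging is exactly what binary_crossentropy does on NxM inputs, and tf.reduce_mean collapses those N values into one number.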

nbro

You can easily define this function using the backend functions sum and binary_crossentropy (or use their equivalents in Tensorflow directly):

from tensorflow.keras import backend as K

def func(A, B):
    return K.sum(K.binary_crossentropy(A, B))

Note that K.binary_crossentropy() assumes that the given input values are probabilities; if that's not the case, then pass from_logits=True as another argument to it.
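For example, the two forms below should produce (numerically) the same loss; the specific numbers here are made up for illustration:

```python
import numpy as np
from tensorflow.keras import backend as K

y_true = K.constant([[1.0, 0.0]])
logits = K.constant([[2.0, -1.0]])  # raw scores, not probabilities

# Either apply the sigmoid yourself...
loss_probs = K.sum(K.binary_crossentropy(y_true, K.sigmoid(logits)))

# ...or let the loss apply it internally via from_logits=True
loss_logits = K.sum(K.binary_crossentropy(y_true, logits, from_logits=True))
```

The from_logits=True path is generally preferable for raw scores, since it uses a numerically stable formulation internally.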

Further, if you would like to use this function in a Lambda layer, then you need to change it so that it accepts a list of tensors as input:

def func(inp):
    return K.sum(K.binary_crossentropy(inp[0], inp[1]), [1,2]) # take the sum for each sample independently

# ...
out = Lambda(func)([A, B])

As you can see, [1,2] has been passed to K.sum() as its axis argument to take the sum over all the elements of a single sample (and not over the whole batch).
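To illustrate the effect of that axis argument outside of a model (a standalone sketch with a made-up batch of two 3x3 "images"):

```python
import numpy as np
from tensorflow.keras import backend as K

# A batch of two 3x3 samples of probabilities: shape (2, 3, 3)
a = K.constant(np.random.rand(2, 3, 3))
b = K.constant(np.random.rand(2, 3, 3))

# Summing over axes 1 and 2 leaves one loss value per sample
per_sample = K.sum(K.binary_crossentropy(a, b), axis=[1, 2])
print(per_sample.shape)  # (2,)
```

Without the axis argument, K.sum would collapse the batch dimension too and return a single scalar for the whole batch.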

today