I want to write a function with two arguments, A and B, tensors of the same shape (for example, 13x13, or some other shape), that returns a number representing the sum of all losses when binary cross-entropy is applied componentwise. So, for each pair A[i, j] and B[i, j] we find the binary cross-entropy loss, and then sum over all i and j. How can I implement that in Keras and TensorFlow?
2 Answers
The solution suggested in this answer may actually not be what you (the reader) are looking for.

If A and B are NxM, where M > 1, then binary_crossentropy(A, B) does not compute the binary cross-entropy element-wise. Instead, it returns an array of shape Nx1, where binary_crossentropy(A, B)[i] corresponds to the average binary cross-entropy between A[i] and B[i] (i.e. it computes the binary cross-entropy between A[i][j] and B[i][j], for all j, then takes the average of those M binary cross-entropies).

If you want to calculate the binary cross-entropy between each element A(i, j) and B(i, j), for all i and j, then you may first want to reshape A and B so that they have the shape (N*M)x1.
    import numpy as np
    import tensorflow as tf

    a = np.random.rand(4, 2).reshape((-1, 1))
    b = np.random.rand(4, 2).reshape((-1, 1))

    print("ce between a[i, j] and b[i, j] =", tf.losses.binary_crossentropy(a, b))
    print("average cross-entropy =", np.mean(tf.losses.binary_crossentropy(a, b)))
However, if you want to compute the binary cross-entropy between A and B element-wise and then take the average of all the binary cross-entropies, you do not need to reshape A and B. In that case, if A and B are NxM arrays, binary_crossentropy(A, B) produces an Nx1 array, where element i corresponds to the average binary cross-entropy between row i of A and row i of B (for i = 1, ..., N). Finally, to take the average of all binary cross-entropies, we also take the average of binary_crossentropy(A, B), i.e. tf.reduce_mean(binary_crossentropy(A, B)).

You can easily define this function using the backend functions sum and binary_crossentropy (or use their equivalents in TensorFlow directly):

    from tensorflow.keras import backend as K

    def func(A, B):
        return K.sum(K.binary_crossentropy(A, B))

Note that K.binary_crossentropy() assumes that the given input values are probabilities; if that's not the case, then pass from_logits=True as another argument to it.
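To illustrate the probabilities-vs-logits distinction, here is a small sketch (the values are made up) using the equivalent tf.keras.losses.binary_crossentropy function, which takes the same from_logits flag: passing probabilities directly, or their logits with from_logits=True, yields the same loss.

```python
import numpy as np
import tensorflow as tf

y_true = tf.constant([[1.0, 0.0]])
probs = tf.constant([[0.8, 0.3]])
logits = tf.math.log(probs / (1.0 - probs))  # inverse of the sigmoid

# Probabilities: used as-is.
loss_probs = tf.keras.losses.binary_crossentropy(y_true, probs).numpy()

# Logits: the sigmoid is applied internally when from_logits=True.
loss_logits = tf.keras.losses.binary_crossentropy(
    y_true, logits, from_logits=True).numpy()

assert np.allclose(loss_probs, loss_logits, atol=1e-5)
```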
Further, if you would like to use this function in a Lambda layer, then you need to change it so that it accepts a list of tensors as input:

    def func(inp):
        # take the sum for each sample independently
        return K.sum(K.binary_crossentropy(inp[0], inp[1]), axis=[1, 2])

    # ...
    out = Lambda(func)([A, B])

As you can see, [1, 2] has been passed to K.sum() as its axis argument to take the sum over all the elements of a single sample (and not over the whole batch).
