Actually, it's not exactly the mean, or, more precisely, not always the mean.
`tf.losses.sigmoid_cross_entropy` has a `reduction` argument (by default equal to `Reduction.SUM_BY_NONZERO_WEIGHTS`) and a `weights` argument (by default `1.0`):
- `weights`: Optional `Tensor` whose rank is either 0, or the same rank as `labels`, and must be broadcastable to `labels` (i.e., all dimensions must be either 1, or the same as the corresponding `losses` dimension).
- `reduction`: Type of reduction to apply to loss.
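For instance, a rank-0 `weights` (a plain scalar) simply rescales the loss. Here's a minimal sketch (TF 1.x; the labels and logits are made up for illustration):

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])

# Default weights=1.0 vs. a scalar weight of 2.0. All weights are
# non-zero, so the default reduction still divides by 3 in both cases.
unweighted = tf.losses.sigmoid_cross_entropy(labels, logits)
scaled = tf.losses.sigmoid_cross_entropy(labels, logits, weights=2.0)

with tf.Session() as sess:
    a, b = sess.run([unweighted, scaled])
    print(a, b)  # b == 2 * a
```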
There are several types of reduction:

- `Reduction.SUM_BY_NONZERO_WEIGHTS` computes the sum divided by the number of non-zero weights.
- `Reduction.SUM` is the weighted sum.
- `Reduction.MEAN` is the weighted mean.
- `Reduction.NONE` means no reduction (the result has the same shape as the input).
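To make the four modes concrete, here's a minimal sketch (again TF 1.x with made-up data) that evaluates the same batch under each reduction; with all weights left at the default `1.0`, `SUM_BY_NONZERO_WEIGHTS` and `MEAN` coincide:

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])

# Build one loss tensor per reduction mode.
losses = {
    name: tf.losses.sigmoid_cross_entropy(
        labels, logits, reduction=getattr(tf.losses.Reduction, name))
    for name in ("NONE", "SUM", "MEAN", "SUM_BY_NONZERO_WEIGHTS")
}

with tf.Session() as sess:
    for name, tensor in losses.items():
        print(name, sess.run(tensor))
# NONE prints the per-element losses (shape (3, 1)); SUM is their sum;
# MEAN and SUM_BY_NONZERO_WEIGHTS both print SUM / 3 here.
```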
As you can see, the result depends on both of them. Yes, when both have their default values, the loss equals the mean. But if one of them is non-default, e.g., one of the weights is zero, the mean is computed over the non-zero weights only, not over the whole batch.
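Here's the zero-weight case as a sketch (same made-up data): masking out one element makes the default reduction divide by 2, the number of non-zero weights, rather than by the batch size of 3:

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])
weights = tf.constant([[1.0], [0.0], [1.0]])  # second element masked out

per_element = tf.losses.sigmoid_cross_entropy(
    labels, logits, reduction=tf.losses.Reduction.NONE)
masked = tf.losses.sigmoid_cross_entropy(labels, logits, weights=weights)

with tf.Session() as sess:
    elems, loss = sess.run([per_element, masked])
    # Mean over non-zero weights: (first + third) / 2, not the sum / 3.
    print(loss, (elems[0, 0] + elems[2, 0]) / 2)  # the two values match
```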