Actually, it's not exactly the mean, or, more precisely, not always the mean.
`tf.losses.sigmoid_cross_entropy` has a `reduction` argument (by default equal to `Reduction.SUM_BY_NONZERO_WEIGHTS`) and a `weights` argument (by default `1.0`):
- `weights`: Optional `Tensor` whose rank is either 0, or the same rank as `labels`, and must be broadcastable to `labels` (i.e., all dimensions must be either 1, or the same as the corresponding `losses` dimension).
- `reduction`: Type of reduction to apply to loss.
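For instance, a rank-0 `weights` (a plain scalar) simply rescales the loss. Here's a minimal sketch (TF 1.x; the labels and logits are made up for illustration):

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])

# Default weights=1.0 vs. a scalar weight of 2.0. All weights are
# non-zero, so the default reduction still divides by 3 in both cases.
unweighted = tf.losses.sigmoid_cross_entropy(labels, logits)
scaled = tf.losses.sigmoid_cross_entropy(labels, logits, weights=2.0)

with tf.Session() as sess:
    a, b = sess.run([unweighted, scaled])
    print(a, b)  # b == 2 * a
```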
There are several types of reduction:

- `Reduction.SUM_BY_NONZERO_WEIGHTS` computes the sum divided by the number of non-zero weights.
- `Reduction.SUM` is the weighted sum.
- `Reduction.MEAN` is the weighted mean.
- `Reduction.NONE` means no reduction (the result has the same shape as the input).
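To make the four modes concrete, here's a minimal sketch (again TF 1.x with made-up data) that evaluates the same batch under each reduction; with all weights left at the default `1.0`, `SUM_BY_NONZERO_WEIGHTS` and `MEAN` coincide:

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])

# Build one loss tensor per reduction mode.
losses = {
    name: tf.losses.sigmoid_cross_entropy(
        labels, logits, reduction=getattr(tf.losses.Reduction, name))
    for name in ("NONE", "SUM", "MEAN", "SUM_BY_NONZERO_WEIGHTS")
}

with tf.Session() as sess:
    for name, tensor in losses.items():
        print(name, sess.run(tensor))
# NONE prints the per-element losses (shape (3, 1)); SUM is their sum;
# MEAN and SUM_BY_NONZERO_WEIGHTS both print SUM / 3 here.
```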
As you can see, the result depends on both of them. Yes, when both have their default values, the loss equals the mean. But if one of them is non-default, e.g., one of the weights is zero, the mean is computed over the non-zero weights only, not over the whole batch.
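Here's the zero-weight case as a sketch (same made-up data): masking out one element makes the default reduction divide by 2, the number of non-zero weights, rather than by the batch size of 3:

```python
import tensorflow as tf

labels = tf.constant([[1.0], [0.0], [1.0]])
logits = tf.constant([[2.0], [-1.0], [0.5]])
weights = tf.constant([[1.0], [0.0], [1.0]])  # second element masked out

per_element = tf.losses.sigmoid_cross_entropy(
    labels, logits, reduction=tf.losses.Reduction.NONE)
masked = tf.losses.sigmoid_cross_entropy(labels, logits, weights=weights)

with tf.Session() as sess:
    elems, loss = sess.run([per_element, masked])
    # Mean over non-zero weights: (first + third) / 2, not the sum / 3.
    print(loss, (elems[0, 0] + elems[2, 0]) / 2)  # the two values match
```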