According to PyTorch's documentation on binary_cross_entropy_with_logits, the weight and pos_weight arguments are described as:
weight (Tensor, optional) – a manual rescaling weight if provided it’s repeated to match input tensor shape
pos_weight (Tensor, optional) – a weight of positive examples. Must be a vector with length equal to the number of classes.
What is the difference between them? The explanation is quite vague. If I understand correctly, weight is an individual rescaling weight for every element (pixel), whereas pos_weight is a weight applied only to the positive elements (the non-zero / non-background pixels)?
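To check my understanding: the docs for torch.nn.BCEWithLogitsLoss give the per-element loss as -w_n * [p * y_n * log(sigmoid(x_n)) + (1 - y_n) * log(1 - sigmoid(x_n))], where w_n is weight and p is pos_weight. Here is a minimal sketch (arbitrary shapes and values) that compares that formula against the library call:

import torch
from torch.nn.functional import binary_cross_entropy_with_logits

torch.manual_seed(0)
x = torch.randn(8)                      # logits
y = torch.randint(0, 2, (8,)).float()   # binary targets
w = torch.rand(8) + 0.5                 # per-element weight
p = torch.tensor([3.0])                 # pos_weight (single class, broadcasts)

# Manual per-element loss following the docs' formula
s = torch.sigmoid(x)
manual = -w * (p * y * torch.log(s) + (1 - y) * torch.log(1 - s))

lib = binary_cross_entropy_with_logits(x, y, weight=w, pos_weight=p, reduction="none")
print(torch.allclose(manual, lib))      # prints True

So weight scales the whole per-element term, while pos_weight only scales the positive half of it, if I read the formula right.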
What if I set both parameters? For example:
import torch
from torch.nn.functional import binary_cross_entropy_with_logits

preds = torch.randn(4, 100, 50, 50)
target = torch.zeros((4, 100, 50, 50))
target[:, :, 10:20, 10:20] = 1  # a 10x10 square of positive pixels per channel

# 100 on positive pixels, 1 everywhere else
pos_weight = target * 100
pos_weight[pos_weight < 100] = 1
weight = target * 100
weight[weight < 100] = 1

loss1 = binary_cross_entropy_with_logits(preds, target, pos_weight=pos_weight, weight=weight)
loss2 = binary_cross_entropy_with_logits(preds, target, pos_weight=pos_weight)
loss3 = binary_cross_entropy_with_logits(preds, target, weight=weight)
Which of loss1, loss2, and loss3 is the correct usage?
On the same subject, I was reading a paper that said:
To deal with the unbalanced negative and positive data, we dilate each keypoint by 10 pixels and use weighted cross-entropy loss. The weight for each keypoint is set to 100 while for non-keypoint pixels it is set to 1.
Which of the above usages matches what the paper describes?
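If weight is the right knob here (which is part of what I'm asking), my attempt at the paper's scheme would look like the following; the heatmap shapes and the dilation via max_pool2d are my own assumptions, not something the paper specifies:

import torch
from torch.nn.functional import binary_cross_entropy_with_logits, max_pool2d

preds = torch.randn(4, 1, 64, 64)   # predicted keypoint-heatmap logits
keypoints = torch.zeros(4, 1, 64, 64)
keypoints[:, :, 32, 32] = 1         # one toy keypoint per image

# "Dilate each keypoint by 10 pixels": max-pooling with a 21x21 window is my
# interpretation of the dilation step
target = max_pool2d(keypoints, kernel_size=21, stride=1, padding=10)

# 100 on (dilated) keypoint pixels, 1 on non-keypoint pixels
weight = torch.ones_like(target)
weight[target > 0] = 100

loss = binary_cross_entropy_with_logits(preds, target, weight=weight)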
Thanks in advance for any explanation!