I have a set of 256x256 images, each labeled with nine binary 256x256 masks. I am trying to calculate the pos_weight argument so I can weight BCEWithLogitsLoss in PyTorch.
The shape of my masks tensor is torch.Size([1000, 9, 256, 256]), where 1000 is the number of training images, 9 is the number of mask channels (all encoded to 0/1), and 256 is the size of each image side.
To calculate pos_weight, I counted the zeros in each mask channel and divided that count by the number of ones in the same channel (following the advice suggested here):
(masks[:,channel,:,:]==0).sum()/masks[:,channel,:,:].sum()
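In full, the computation looks like the sketch below (the dummy masks tensor is just a random stand-in for my data, and the vectorized reduction over the batch and spatial dimensions is equivalent to the per-channel expression above):

```python
import torch

# Random stand-in for my data: binary masks of shape [N, 9, 256, 256]
# (smaller N here than my real 1000 images, but the layout is the same)
masks = torch.randint(0, 2, (16, 9, 256, 256), dtype=torch.float32)

# Ratio of negatives to positives per channel: reduce over the batch (0)
# and spatial (2, 3) dimensions, leaving one weight per mask channel
pos_weight = (masks == 0).sum(dim=(0, 2, 3)) / masks.sum(dim=(0, 2, 3))
print(pos_weight.shape)  # torch.Size([9])
```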
Calculating the weight for every mask channel produces a tensor of shape torch.Size([9]), which seems intuitive to me, since I want one pos_weight value per mask channel. However, when I try to fit my model, I get the following error message:
RuntimeError: The size of tensor a (9) must match the size of
tensor b (256) at non-singleton dimension 3
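A minimal snippet along these lines reproduces the error (the tensor contents are random placeholders; only the shapes match my setup):

```python
import torch
import torch.nn as nn

logits = torch.randn(16, 9, 256, 256)                     # raw model outputs
targets = torch.randint(0, 2, (16, 9, 256, 256)).float()  # binary masks
pos_weight = torch.rand(9) * 10                           # placeholder per-channel weights

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
loss = criterion(logits, targets)  # raises the RuntimeError above
```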
This error message is surprising because it suggests that the weights need to match the image side length rather than the number of mask channels. What shape should pos_weight be, and how do I tell the loss that the weights apply to the mask channels rather than to individual image pixels?