11

I am training a PyTorch model to perform binary classification. My minority class makes up about 10% of the data, so I want to use a weighted loss function. The docs for BCELoss and CrossEntropyLoss say that I can use a 'weight' for each sample.

However, when I declare CE_loss = nn.BCELoss() or nn.CrossEntropyLoss() and then call CE_loss(output, target, weight=batch_weights), where output, target, and batch_weights are Tensors of size batch_size, I get the following error message:

forward() got an unexpected keyword argument 'weight'
clueless

5 Answers

10

Another way you could accomplish your goal is to use reduction='none' when initializing the loss, then multiply the resulting per-element loss tensor by your weights before taking the mean, e.g.

import torch

loss = torch.nn.BCELoss(reduction='none')  # keep per-element losses
model = torch.sigmoid                      # stand-in "model": squashes inputs to (0, 1)

weights = torch.rand(10, 1)   # one weight per element
inputs = torch.rand(10, 1)
targets = torch.rand(10, 1)   # BCELoss accepts targets anywhere in [0, 1]

intermediate_losses = loss(model(inputs), targets)
final_loss = torch.mean(weights * intermediate_losses)

Of course, for your scenario you would still need to calculate the weights tensor; one way to do that is sketched below. But hopefully this helps!
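For an imbalanced binary problem like yours, one way to build that weights tensor (a sketch; the class weights 1.0 and 9.0 are made up, chosen to up-weight a ~10% minority class) is:

import torch

w_neg, w_pos = 1.0, 9.0                      # hypothetical per-class weights
targets = (torch.rand(10, 1) > 0.9).float()  # toy imbalanced 0/1 labels

# per-element weight: w_pos where target == 1, w_neg where target == 0
weights = w_pos * targets + w_neg * (1 - targets)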

Tbudding
3

It is not clear what value you are passing for batch_weights here. Could it be that you want to apply separate fixed weights to all elements of class 0 and class 1 in your dataset? If so, that is not what the weight parameter in BCELoss does. The weight parameter expects a separate weight for every ELEMENT in the batch, not for every CLASS. There are several ways around this: you could construct a weight tensor with one entry per element, or you could use a custom loss function that does what you want:

import torch

def BCELoss_class_weighted(weights):
    # weights[0] is applied to class-0 (negative) terms, weights[1] to class-1 (positive) terms
    def loss(input, target):
        # clamp predictions away from exactly 0 and 1 so the logs stay finite
        input = torch.clamp(input, min=1e-7, max=1 - 1e-7)
        bce = - weights[1] * target * torch.log(input) \
              - (1 - target) * weights[0] * torch.log(1 - input)
        return torch.mean(bce)

    return loss

Note that it is important to add a clamp to avoid numerical instability.
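For example, usage could look like this (a sketch with made-up class weights):

loss_fn = BCELoss_class_weighted(weights=[1.0, 9.0])  # [weight for class 0, weight for class 1]

pred = torch.sigmoid(torch.randn(8, 1))    # probabilities in (0, 1)
target = (torch.rand(8, 1) > 0.9).float()  # imbalanced 0/1 labels
print(loss_fn(pred, target))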

HTH Jeroen

Jeroen Vuurens
1

The issue is where you are providing the weight parameter. As mentioned in the docs, the weight parameter should be provided when the module is instantiated.

For example, something like,

import torch
from torch import nn

weights = torch.FloatTensor([2.0, 1.2])
loss = nn.BCELoss(weight=weights)  # note: the keyword is 'weight', not 'weights'

You can find a more concrete example here or another helpful PT forum discussion here.

  • 1
    The documentation for BCELoss says that 'weight' should be 'a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.' What if the weights will change for each batch? – clueless May 27 '21 at 23:27
  • On the face of it I don't think that's possible. On top of it, I think having per-batch dynamic rescaling weights might have a negative impact on learning, as the loss function attributes are constantly changing per batch. – randomseed42 Jun 11 '21 at 20:23
  • BCELoss does not work like this. The weights are for the batch elements. Try it: `bce_loss = torch.nn.BCELoss(weight=torch.tensor([1, 50]))` then `labels = torch.tensor(np.random.randn(12) > 1).double()` then `outputs = torch.sigmoid(torch.tensor(np.random.randn(12))).double()` then `bce_loss(outputs, labels)` results in the following error. "RuntimeError: The size of tensor a (12) must match the size of tensor b (2) at non-singleton dimension 0" – Finncent Price Jun 23 '23 at 20:31
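Regarding the question in the comments about weights that change every batch: one option (a sketch using the functional form torch.nn.functional.binary_cross_entropy, which accepts a fresh per-element weight tensor on every call) is:

import torch
import torch.nn.functional as F

outputs = torch.sigmoid(torch.randn(12))  # probabilities, shape (nbatch,)
labels = (torch.randn(12) > 1).float()    # 0/1 targets

# per-element weights, recomputed for every batch: 50x on positive elements
batch_weights = labels * 50.0 + (1 - labels) * 1.0

loss = F.binary_cross_entropy(outputs, labels, weight=batch_weights)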
0

You need to pass the weights when constructing the loss, like below (weight must be a Tensor with one entry per class):

CE_loss = nn.CrossEntropyLoss(weight=…)
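A runnable sketch (the class weights here are made up):

import torch
from torch import nn

class_weights = torch.tensor([1.0, 9.0])  # up-weight the minority class
CE_loss = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)           # raw scores, shape (batch, num_classes)
targets = torch.randint(0, 2, (8,))  # class indices
print(CE_loss(logits, targets))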
Sudhanshu
0

This is similar to the idea of @Jeroen Vuurens, but the class weights are determined by the target mean:

# fraction of positive (class-1) elements in the training targets
y_train_mean = y_train.mean()

bi_cls_w2 = 1 / (1 - y_train_mean)        # base weight applied to every element
bi_cls_w1 = 1 / y_train_mean - bi_cls_w2  # extra weight added on positive elements

bce_loss = nn.BCELoss(reduction='none')
loss_fun = lambda pred, target: ((bi_cls_w1*target + bi_cls_w2) * bce_loss(pred, target)).mean()
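This gives every positive element an effective weight of bi_cls_w1 + bi_cls_w2 = 1/y_train_mean and every negative element a weight of 1/(1 - y_train_mean), i.e. inverse class frequency. Assuming y_train is a 0/1 float tensor, a quick usage sketch:

pred = torch.sigmoid(torch.randn_like(y_train))  # hypothetical model outputs
print(loss_fun(pred, y_train))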
Christian