Hello, I'm implementing an LSTM for language modelling as homework, and I'm at the loss-implementation phase. Our instructor told us to use F.nll_loss, but the sequences are padded, and we are given a mask telling us where each sequence stops, which the loss has to take into account.
Inputs:
- log_probas (batch_size, sequence_length (padded), vocabulary_size)
- targets (batch_size, sequence_length (padded))
- mask (batch_size, sequence_length (padded))
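For concreteness, here is a minimal sketch of dummy tensors with those shapes (all the sizes, and the fixed length of 6, are made up purely for illustration):

import torch

batch_size, seq_len, vocab_size = 4, 10, 100  # made-up sizes
log_probas = torch.randn(batch_size, seq_len, vocab_size).log_softmax(dim=-1)
targets = torch.randint(0, vocab_size, (batch_size, seq_len))
# my understanding: mask is 1.0 for real tokens, 0.0 for padding
mask = (torch.arange(seq_len) < 6).float().expand(batch_size, seq_len)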
Naive implementation, which works but ignores the mask:

import torch.nn.functional as F

# F.nll_loss expects the class dimension second, i.e. (batch, vocab, seq),
# so swap the last two axes of log_probas
loss = F.nll_loss(log_probas.transpose(1, 2), targets)
I've been crawling the internet and banging my head against the wall, but I can't seem to find an answer on how to incorporate the mask into the averaging scheme of the loss.
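My best guess so far is the sketch below: ask F.nll_loss for per-token losses with reduction='none', zero out the padded positions with the mask, and divide by the number of real tokens rather than by all positions (this assumes the mask is 1 for real tokens and 0 for padding, as in the sketch above). Does this averaging look right?

# per-token losses, shape (batch_size, sequence_length)
per_token = F.nll_loss(log_probas.transpose(1, 2), targets, reduction='none')
# zero the padded positions, then average over the real tokens only
loss = (per_token * mask).sum() / mask.sum()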