0

I'm following the multiple choice QA tutorial and trying to modify it slightly to fit my data. My data is exactly the same, except that I have 5 labels instead of 4:

# original data:
from datasets import load_dataset
swag = load_dataset("swag", "regular")
set(swag["train"]['label'])
>>> {0, 1, 2, 3}

# my data:
set(train_dataset["train"]['label'])
>>>
{0, 1, 2, 3, 4}

I'm running the code in the tutorial and getting the error: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [9,0,0] Assertion t >= 0 && t < n_classes failed.

I found from here and here that this can be caused when the target values are out of bounds, which can happend when using nn.CrossEntropyLoss which expects a torch.LongTensor with values in the range [0, nb_classes-1].

I will not copy the entire script from the tutorial since it's in the link above, but I found that the error can be replicated by modifying the DataCollatorForMultipleChoice function by adding an extra label as follows:

from random import choices

@dataclass
class DataCollatorForMultipleChoice:
    """
    Data collator that will dynamically pad the inputs for multiple choice received.
    """

    tokenizer: PreTrainedTokenizerBase
    padding: Union[bool, str, PaddingStrategy] = True
    max_length: Optional[int] = None
    pad_to_multiple_of: Optional[int] = None

    def __call__(self, features):
        label_name = "label" if "label" in features[0].keys() else "labels"
        labels = [feature.pop(label_name) for feature in features]
        labels = [random.choice(range(5)) for _ in range(16)] #<<<---ADDING EXTRA LABEL HERE. INSTEAD OF 0-4 THIS IS BETWEEN 0-5

        print(len(labels))
        print(labels)
        batch_size = len(features)
        num_choices = len(features[0]["input_ids"])
        flattened_features = [
            [{k: v[i] for k, v in feature.items()} for i in range(num_choices)] for feature in features
        ]
        flattened_features = sum(flattened_features, [])

        batch = self.tokenizer.pad(
            flattened_features,
            padding=self.padding,
            max_length=self.max_length,
            pad_to_multiple_of=self.pad_to_multiple_of,
            return_tensors="pt",
        )

        batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
        batch["labels"] = torch.tensor(labels, dtype=torch.int64)
        return batch

Then when I run the trainer I get:

16 # batch
[0, 0, 2, 1, 1, 1, 0, 4, 0, 4, 3, 0, 0, 0, 1, 1] # labels
... nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [7,0,0] Assertion `t >= 0 && t < n_classes` failed.

I tried changing the number of labels in the model:

# original:
# model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
# my modification:
model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased", num_labels=5)

but I got the same error.

The script runs just fine with my data if I modify the added line from above to

labels = [random.choice(range(4)) for _ in range(16)] # note that now it's from 0-4 and not from 0-5
Penguin
  • 1,923
  • 3
  • 21
  • 51

0 Answers0