
I am running code from another repository, but my issue is general, so I am posting it here. Running their code, I get an error along the lines of "Expected all tensors to be on the same device, but found at least two devices: cpu and cuda:0". I have already verified that the model is on cuda:0; the issue is that the batches coming out of the dataloader are not being moved to that device. For context, the models are huggingface-transformers models and the data comes from huggingface datasets.
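
For reference, this is roughly how I confirmed the model's device (just a quick check; `trainer` here stands for my Trainer subclass instance):

    # sanity check that the model's weights live on the GPU
    device = next(trainer.model.parameters()).device
    print(device)  # prints "cuda:0" in my case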

Here is the relevant block of code where the issue arises:

    eval_dataset = self.eval_dataset if eval_dataset is None else eval_dataset
    eval_dataloader = self.get_eval_dataloader(eval_dataset)
    eval_examples = self.eval_examples if eval_examples is None else eval_examples
    compute_metrics = self.compute_metrics
    self.compute_metrics = None
    eval_loop = (self.prediction_loop if self.args.use_legacy_prediction_loop else self.evaluation_loop)
    try:
        # this is where the error occurs
        output = eval_loop(
            eval_dataloader,
            description="Evaluation",
            prediction_loss_only=True if compute_metrics is None else None,
            ignore_keys=ignore_keys,
        )

For context, this occurs inside an evaluate() method of a class inheriting from huggingface's Seq2SeqTrainer. I have tried something like

    for i, (inputs, labels) in eval_dataloader:
        inputs, labels = inputs.to(device), labels.to(device)

But that doesn't work (it gives the error "too many values to unpack (expected 2)"). Is there any other way I can send this dataloader's batches to the GPU? In particular, is there any way I can edit the evaluation_loop method of the Transformers Trainer to move the batches to the GPU, something along the lines of the sketch below?
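
What I have in mind is roughly the wrapper below (only a sketch to illustrate the idea; `DeviceDataLoader` and `move_to_device` are names I made up, not anything from the repo): wrap the eval dataloader so every batch dict is moved to the model's device before it reaches the evaluation loop.

    # rough sketch: wrap the eval dataloader so each batch (a dict of
    # tensors for huggingface models) is moved to the GPU before use.
    def move_to_device(batch, device):
        # move every tensor in the batch dict to the target device
        return {k: v.to(device) if hasattr(v, "to") else v for k, v in batch.items()}

    class DeviceDataLoader:
        def __init__(self, dataloader, device):
            self.dataloader = dataloader
            self.device = device

        def __len__(self):
            return len(self.dataloader)

        def __iter__(self):
            for batch in self.dataloader:
                yield move_to_device(batch, self.device)

    # then, before calling eval_loop, something like:
    # eval_dataloader = DeviceDataLoader(eval_dataloader, self.model.device)

But I'm not sure whether wrapping the dataloader like this is reasonable, or whether overriding evaluation_loop itself is the better approach.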
