
I have a pre-trained model, facebook/bart-large-mnli, and I used the Trainer to fine-tune it on my own dataset:

model = BartForSequenceClassification.from_pretrained("facebook/bart-large-mnli", num_labels=14, ignore_mismatched_sizes=True)

And then after I train it, I try to use the following (creating a pipeline with the fine-tuned model):

# Import the Transformers pipeline library
from transformers import pipeline

# Initializing Zero-Shot Classifier
classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer, id2label=id2label)

I get the following error from it:

Failed to determine 'entailment' label id from the label2id mapping in the model config. Setting to -1. Define a descriptive label2id mapping in the model config to ensure correct outputs.
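The warning means the pipeline cannot find an "entailment" entry in config.label2id. As a rough illustration, here is a simplified sketch of the lookup it performs; the helper name and label sets below are assumptions for illustration, not the pipeline's actual code:

```python
# Simplified sketch: the zero-shot pipeline scans config.label2id for a
# label starting with "entail" to find the entailment logit index.
def find_entailment_id(label2id: dict) -> int:
    for label, idx in label2id.items():
        if label.lower().startswith("entail"):
            return idx
    return -1  # the fallback the warning reports

# A BART-MNLI-style mapping that keeps "entailment" after fine-tuning:
label2id = {"contradiction": 0, "neutral": 1, "entailment": 2}
id2label = {v: k for k, v in label2id.items()}

print(find_entailment_id(label2id))        # 2
print(find_entailment_id({"LABEL_0": 0}))  # -1 -> triggers the warning
```

In practice you would assign such mappings to the config before building the pipeline, e.g. model.config.label2id = label2id and model.config.id2label = id2label.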

I tried searching the web for a solution but couldn't find anything. You can refer to my previous question, where I had trouble training it, here.


How to solve the first error:

Defining a descriptive label2id mapping in the model config, as the warning suggests, solves the first error.

Second error:

I'm getting the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

I tried deleting my custom metrics, which fixed it temporarily, but the error keeps coming back.

The error is coming from here:

sequences = "Some text sequence"
classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer)
classifier(sequences, list(id2label.values()), multi_label=False)
# id2label is a dictionary mapping each integer label ID to its label name
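For clarity, the candidate labels passed to the classifier are just the values of id2label; a toy sketch (the label names below are made up for illustration):

```python
# Toy id2label mapping (integer ID -> label name); its values become the
# candidate labels handed to the zero-shot pipeline.
id2label = {0: "sports", 1: "politics", 2: "technology"}
candidate_labels = list(id2label.values())
print(candidate_labels)  # ['sports', 'politics', 'technology']
```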

I also tried trainer.save_model(actual_model), but it saved only part of the model; when I loaded it, it behaved as if it hadn't been trained at all.


If I change the line to:

classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer) # OLD

classifier = pipeline("zero-shot-classification", model=model.to('cpu'), tokenizer=tokenizer) # NEW

It works fine, but if I change it to:

classifier = pipeline("zero-shot-classification", model=model.to('cuda'), tokenizer=tokenizer)

I get the same error. My model was trained on a GPU cluster and I want to test it on a GPU as well. Is that possible, or am I missing something?

From what I checked, the options the to() function accepts are: cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, privateuseone
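One way to avoid hard-coding the device string is to pick it at runtime. A minimal sketch, assuming PyTorch is installed; the helper name is mine, and returning 0 matches the integer GPU index that the pipeline's device parameter accepts:

```python
import torch

# Pick the first GPU if CUDA is available, otherwise fall back to CPU.
# The integer 0 is the index form the pipeline's device argument takes
# for the first GPU; "cpu" is accepted as a string.
def pick_pipeline_device():
    return 0 if torch.cuda.is_available() else "cpu"

device = pick_pipeline_device()
print(device)
```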

Dolev Mitz
  • It depends on how exactly you finetune, but I guess you might want to use `text-classification` pipeline instead of the zero-shot one? – Jindřich May 04 '23 at 07:42
  • I need the zero-shot one, and you can look at the previous questions to see the arguments for the fine-tuning :) – Dolev Mitz May 04 '23 at 07:44
  • You used the wrong finetuning approach to perform zero-shot classification. Can you please explain what you want to do (e.g. the data you have and what the model should do)? Please keep the explanation at that simple level and avoid technical terms. @DolevMitz – cronoik May 04 '23 at 09:30
  • I have roughly about 14 types of labels, I want to finetune this model on my own data for better and more precise classification – Dolev Mitz May 07 '23 at 09:54
  • @cronoik I will :) Yes, I do because there is a possibility that my 14 labels won't cover all that there are. – Dolev Mitz May 07 '23 at 21:27
  • Please note that SO is not a 24-7-365 support. Most people at SO use their free time. I answered your question [here](https://stackoverflow.com/a/76213874/6664872) because it didn't really answer your post as it is currently written. @DolevMitz – cronoik May 09 '23 at 23:05
  • 1
    @cronoik Yeah I know that you are not here 24-7-365, I will have a look at the link you provided – Dolev Mitz May 10 '23 at 04:31
  • @cronoik But thanks to the `num_labels=14, ignore_mismatched_sizes=True` I was able to solve the [last error I had](https://stackoverflow.com/questions/76099140/hugging-face-transformers-bart-cuda-error-cublas-status-not-initialize) – Dolev Mitz May 10 '23 at 13:42
  • Ok you were right I was able to create a classifier like that, but now I'm getting the error `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)` When I'm using `classifier(sequences, list(id2label.values()), multi_label=False)`. Do you have any idea why? @cronoik – Dolev Mitz May 10 '23 at 13:55
  • I was able to work it out, turns out my new metrics were wrong for zero-shot so that's why that error was going, I just deleted my new metrics. Question for you, how can I measure and test scores of the model? I didn't find anything on the internet about it. @cronoik – Dolev Mitz May 11 '23 at 05:15
  • @DolevMitz can you please delete the comments that are not relevant to your question? I would say everything except the first one. – cronoik May 11 '23 at 09:43
  • I updated the question accordingly, because I'm still having a problem with it @cronoik – Dolev Mitz May 16 '23 at 12:01
  • @cronoik I added another update – Dolev Mitz May 17 '23 at 05:03

1 Answer


After training, your model is apparently still on the GPU. The error message you receive:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

is thrown because the input tensors generated by the pipeline are still on the CPU. That is also why the pipeline works as expected once you move the model to the CPU with model.to('cpu').

By default, the pipeline runs on the CPU; you can change that behavior by specifying the device parameter:

# cuda
classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer, device=0)

# cpu
classifier = pipeline("zero-shot-classification", model=model, tokenizer=tokenizer, device="cpu")
cronoik
  • I have a continue question, I would appreciate you looking there when you can. [https://stackoverflow.com/questions/76277980/huggingface-evaluate-a-fine-tuned-zero-shot-model](https://stackoverflow.com/questions/76277980/huggingface-evaluate-a-fine-tuned-zero-shot-model) – Dolev Mitz May 18 '23 at 05:37
  • I even added a bounty to it, if you'll be interested :) – Dolev Mitz May 22 '23 at 04:41