Finetune mBART on pre-train tasks using HuggingFace

Question

I would like to finetune facebook/mbart-large-cc25 on my data using pre-training tasks, in particular Masked Language Modeling (MLM).

How can I do that in HuggingFace?

Edit: rewrote the question for the sake of clarity

I think for the most part you can simply follow the existing Q&A scripts (e.g., [these ones](https://github.com/huggingface/transformers/tree/master/examples/pytorch/question-answering)) and substitute in MBart. If you do need specific help, please make sure your post only includes *a single question*, to ensure answers are consistent. — dennlinger, Sep 23 '21 at 09:44

score 0 · Answer 1 · answered Sep 23 '21 at 09:44


Since you are doing everything in HuggingFace, fine-tuning a model on a pre-training task (assuming HuggingFace provides that task) works much the same way for most models. Which tasks are you interested in fine-tuning mBART on?

HuggingFace provides extensive documentation for several fine-tuning tasks. For instance, the links below will help you fine-tune HF models for language modelling, MNLI, SQuAD, etc.: https://huggingface.co/transformers/v2.0.0/examples.html and https://huggingface.co/transformers/training.html

Zain Sarwar

Please note that the link to the examples is *extremely* outdated. Please refer to this one instead (latest version): https://huggingface.co/transformers/master/examples.html – dennlinger Sep 23 '21 at 09:53
I would like to do MLM using HF, but I didn't find any tutorial – albero Sep 23 '21 at 09:54
This link should help: https://github.com/huggingface/transformers/tree/master/examples/pytorch/language-modeling The script run_mlm.py can be used for fine-tuning your HF model. – Zain Sarwar Sep 23 '21 at 10:07
mBART is not supported according to the comment at line 19 https://github.com/huggingface/transformers/blob/62832c962f85b5a554ebf8b930d13b76b9028a8d/examples/pytorch/language-modeling/run_mlm.py#L19 – albero Sep 23 '21 at 14:03
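
If run_mlm.py does not cover mBART, one possible workaround is to build the noised inputs yourself and treat the task as sequence-to-sequence: the corrupted sentence is the source and the original sentence is the target. The sketch below is only an illustration of that idea, not the official mBART recipe. The function names (`infill_noise`, `make_pair`), the uniform span lengths, and the default ratios are assumptions made for this example; the literal `<mask>` string is assumed to match the tokenizer's mask token and should be checked against `tokenizer.mask_token`.

```python
import random

# Assumption: the mBART tokenizer's mask token is "<mask>";
# verify with tokenizer.mask_token before relying on this.
MASK = "<mask>"


def infill_noise(words, mask_ratio=0.35, mean_span=3, rng=None):
    """Replace random spans of words with a single MASK token each.

    Simplified text infilling: span lengths are drawn uniformly from
    1..(2 * mean_span - 1) instead of from a Poisson distribution.
    At most round(len(words) * mask_ratio) words are removed.
    """
    if rng is None:
        rng = random.Random()
    budget = int(round(len(words) * mask_ratio))  # words we may still remove
    out = []
    i = 0
    while i < len(words):
        start_span = budget > 0 and rng.random() < mask_ratio / mean_span
        if start_span:
            span = min(budget, rng.randint(1, 2 * mean_span - 1), len(words) - i)
            out.append(MASK)  # the whole span collapses into one mask token
            i += span
            budget -= span
        else:
            out.append(words[i])
            i += 1
    return out


def make_pair(sentence, seed=None):
    """Return (noised_source, original_target) for one sentence."""
    rng = random.Random(seed)
    words = sentence.split()
    return " ".join(infill_noise(words, rng=rng)), sentence


if __name__ == "__main__":
    source, target = make_pair("the quick brown fox jumps over the lazy dog today", seed=0)
    print("source:", source)
    print("target:", target)
```

The resulting pairs could then be tokenized (noised text as the input, original text as the labels) and passed to whatever seq2seq fine-tuning setup you already use for `MBartForConditionalGeneration`; that part is left out here because it depends on your data and language codes.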
