
I wanted to pre-train BERT with data from my own language, since the multilingual BERT model (which includes my language) is not successful for it. Since full pre-training costs a lot, I decided to fine-tune it on its own two tasks: masked language modeling and next sentence prediction. There are previous implementations for other tasks (NER, sentiment analysis, etc.), but I couldn't find any fine-tuning on BERT's own pre-training tasks. Is there an implementation that I missed? If not, where should I start? I need some initial help.

ozler.kb

1 Answer


A wonderful resource for BERT is: https://github.com/huggingface/pytorch-pretrained-BERT. This repository contains op-for-op PyTorch reimplementations, pre-trained models and fine-tuning examples for Google's BERT model.

You can find the language model fine-tuning examples in the following link. The three example scripts in this folder can be used to fine-tune a pre-trained BERT model using the pretraining objective (the combination of masked language modeling and next sentence prediction loss).
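If it helps to see what that combined objective looks like in code, here is a minimal sketch of a single training step using `BertForPreTraining` from that package, which carries both the masked-LM and next-sentence-prediction heads. This is not the repository's fine-tuning script: the example sentences, the masked position, and the learning rate are arbitrary placeholders, and a real run would build batches from your own corpus (as the example scripts do).

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForPreTraining

tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
model = BertForPreTraining.from_pretrained('bert-base-multilingual-cased')
model.train()

# One sentence pair for next-sentence prediction
# (label 0 = sentence B really follows sentence A).
tokens_a = tokenizer.tokenize("the man went to the store .")
tokens_b = tokenizer.tokenize("he bought a gallon of milk .")
tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)

# Masked-LM objective: positions labeled -1 are ignored by the loss;
# here we mask a single arbitrary position for illustration.
masked_lm_labels = [-1] * len(tokens)
mask_pos = 2
masked_lm_labels[mask_pos] = tokenizer.convert_tokens_to_ids([tokens[mask_pos]])[0]
tokens[mask_pos] = "[MASK]"

input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
segment_ids = torch.tensor([segment_ids])
attention_mask = torch.ones_like(input_ids)
lm_labels = torch.tensor([masked_lm_labels])
next_sentence_label = torch.tensor([0])

optimizer = torch.optim.Adam(model.parameters(), lr=3e-5)

# With both label tensors supplied, the forward pass returns the summed
# masked-LM + next-sentence-prediction loss.
loss = model(input_ids, segment_ids, attention_mask,
             masked_lm_labels=lm_labels,
             next_sentence_label=next_sentence_label)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In practice you would loop this over shuffled sentence pairs (half true continuations, half random sentences) with ~15% of tokens masked, which is exactly what the example scripts in the repository automate for you.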

By the way, multilingual BERT is available for 104 languages (ref) and has been found surprisingly effective in many cross-lingual NLP tasks (ref). So before investing in pre-training from scratch, make sure you are using multilingual BERT appropriately for your task.

Wasi Ahmad