All I have found on the internet are examples for classification tasks, but my problem has no labels (I only have a set of tweets). My task is as follows: generate word embeddings using BERT, then use those word embeddings in a downstream task. My objective: I want to fine-tune BERT so it produces better word embeddings. How can I do that?
Does this answer your question? [How to train BERT from scratch on a new domain for both MLM and NSP?](https://stackoverflow.com/questions/65646925/how-to-train-bert-from-scratch-on-a-new-domain-for-both-mlm-and-nsp) – Ashwin Geet D'Sa Nov 15 '21 at 12:47
1 Answer
You see classification examples for BERT everywhere because it is, at heart, a text-encoding model that is most commonly fine-tuned for classification. However, Hugging Face provides a BertGeneration interface that lets you deploy BERT as a sequence-generation model.
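As a minimal sketch, you could wire two copies of a BERT checkpoint into an encoder-decoder using the BertGeneration classes from `transformers`; the checkpoint name is a placeholder, and the token IDs 101/102 are `[CLS]`/`[SEP]` in the standard BERT vocabulary:

```python
from transformers import (
    BertGenerationDecoder,
    BertGenerationEncoder,
    BertTokenizer,
    EncoderDecoderModel,
)

# Encoder: a plain BERT checkpoint ("bert-base-uncased" is a placeholder).
encoder = BertGenerationEncoder.from_pretrained(
    "bert-base-uncased", bos_token_id=101, eos_token_id=102
)

# Decoder: the same checkpoint, reconfigured for causal decoding with
# cross-attention over the encoder outputs.
decoder = BertGenerationDecoder.from_pretrained(
    "bert-base-uncased",
    add_cross_attention=True,
    is_decoder=True,
    bos_token_id=101,
    eos_token_id=102,
)

model = EncoderDecoderModel(encoder=encoder, decoder=decoder)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Needed so the model can derive decoder inputs from `labels` during training.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```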
Once you have that, you can fine-tune BERT as an autoencoder by using the same text as both input and output. It will not be an autoencoder in the strict sense because of the masking, but it should serve your purpose. Finally, you can use the encoder part (fully or selectively) to produce embeddings for your downstream task.
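A hedged sketch of that autoencoder-style fine-tuning, continuing from the `model` and `tokenizer` above; `tweets`, the batching, and all hyperparameters are illustrative stand-ins for your own data and training loop:

```python
import torch

# `tweets` stands in for your unlabeled corpus.
tweets = ["just setting up my twttr", "another example tweet"]

inputs = tokenizer(
    tweets, padding=True, truncation=True, max_length=64, return_tensors="pt"
)
# Autoencoder objective: the reconstruction target is the input itself.
# (For real training you would also set padded positions in `labels`
# to -100 so they are ignored by the loss.)
labels = inputs.input_ids.clone()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=labels,
)
outputs.loss.backward()  # a single step; wrap in your own training loop
optimizer.step()
```

After fine-tuning, `model.encoder` is your adapted BERT; running your tweets through it and reading `last_hidden_state` gives the per-token embeddings for the downstream task.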

jdsurya