I need some help with adding additional words to the existing BERT vocabulary. I have two queries; kindly guide me.
I am working on an NER task for a specific domain:
There are a few words (I am not sure of the exact number) that the BERT tokenizer maps to [UNK], but the model needs to recognize those entities. When I fine-tune "bert-base-cased" on my labeled data, the pretrained model learns reasonably well (up to 80% accuracy), but intuitively it should learn better if it recognized all the entities instead of seeing [UNK].
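For context, this is roughly how I am identifying the unknown words — a minimal sketch using the Hugging Face transformers tokenizer; the word list here is hypothetical, my real list comes from the NER corpus:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

# Hypothetical domain-specific terms (placeholders for my real entity list).
domain_words = ["abciximab", "pembrolizumab", "HER2neu"]

# A word is effectively "unknown" if any of its WordPiece pieces is [UNK].
for word in domain_words:
    pieces = tokenizer.tokenize(word)
    if tokenizer.unk_token in pieces:
        print(f"{word!r} -> {pieces} (contains {tokenizer.unk_token})")
```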
1. Do I need to add those unknown entities to vocab.txt and fine-tune the model again? (A sketch of what I mean is after this list.)
2. Do I need to train the BERT model on my data from scratch?
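For reference, here is what I imagine option 1 would look like — a sketch using Hugging Face's `add_tokens` and `resize_token_embeddings`, assuming that is the right mechanism rather than editing vocab.txt by hand (the token list and label count are hypothetical):

```python
from transformers import BertTokenizer, BertForTokenClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
# num_labels=9 is a placeholder for my actual NER tag set size.
model = BertForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)

# Hypothetical new domain terms that currently tokenize to [UNK].
new_tokens = ["abciximab", "pembrolizumab"]

num_added = tokenizer.add_tokens(new_tokens)
print(f"Added {num_added} tokens")

# The embedding matrix must grow to cover the new vocabulary entries;
# the new rows start randomly initialized and are learned during fine-tuning.
model.resize_token_embeddings(len(tokenizer))
```

Is this the recommended approach, or is full pretraining from scratch on domain text the better route?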
Thanks...