This is a question about training models in spaCy 3.x.

I couldn't find a good answer on StackOverflow, hence this question.

Suppose I am using one of spaCy's existing models, such as the en model, and want to add my own entity types and train it on them. I work in the biomedical domain, so these would be things like virus name, shape, length, temperature, and temperature value. How can I do this without losing the entities spaCy already tags, such as organization names and countries?

All suggestions are appreciated.

Thanks

ary

1 Answer

There are a few ways to do that.

The best way is to train a separate NER model on your own entities and then combine both models in a single pipeline, with one NER component running before the other. See the double NER example project for an overview of that.
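As a rough illustration of the "two NER components in one pipeline" idea, here is a minimal sketch. It is not the code from the example project. It assumes that you have already trained your own pipeline with a component named "ner", that the pretrained package is en_core_web_sm, and that your model lives at ./biomed_ner/model-best (a hypothetical path). The helper names add_custom_ner and load_combined are made up for this example; the only spaCy calls used are spacy.load and nlp.add_pipe with its source argument.

```python
def add_custom_ner(base_nlp, custom_nlp, name="ner_custom"):
    """Copy the trained "ner" component of custom_nlp into base_nlp.

    The copy is registered under `name` and placed before the pretrained
    "ner" component, so it runs first. Assumes both pipelines contain a
    component that is actually called "ner".
    """
    base_nlp.add_pipe("ner", name=name, source=custom_nlp, before="ner")
    return base_nlp


def load_combined(base_model="en_core_web_sm",
                  custom_model="./biomed_ner/model-best"):
    """Load both pipelines and merge them.

    The custom_model default is a hypothetical path; point it at the
    output directory of your own training run.
    """
    import spacy  # imported lazily so add_custom_ner works on already loaded pipelines

    base_nlp = spacy.load(base_model)
    custom_nlp = spacy.load(custom_model)
    return add_custom_ner(base_nlp, custom_nlp)
```

Calling load_combined() should give a pipeline whose pipe_names contain both "ner_custom" and "ner", and doc.ents will then hold entities from both. As far as I know, the NER component that runs later keeps the entities already set on the doc and only adds non-overlapping ones, so the order decides which model wins when the two disagree. One caveat: if your custom NER was trained with a shared tok2vec listener, you will probably need to call custom_nlp.replace_listeners first, or source the tok2vec component as well.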

It's also possible to update the pretrained NER model; see this example project. However, this usually isn't a good idea, and definitely not if you're adding completely different entities. You'll run into what's called "catastrophic forgetting": even though you're technically updating the model, it ends up forgetting everything that isn't represented in your current training data.

polm23