
I'm trying to train a Named Entity Recognition (NER) model for custom tags using spaCy version 3. I went through all the documentation on their website, but I cannot figure out the proper way to create the pipeline model. If I use en_core_web_trf, I seem unable to train my own tags: the final output scores are all zeroes. The same process works correctly with en_core_web_sm.

However, if I use a makeshift method of creating a blank English model and then manually sourcing the transformer component from en_core_web_trf and the ner component from en_core_web_sm, it works.

My question is - is there a better way to initialize my model and the pipeline methods, other than this makeshift method? I do not care about pre-trained entities like LOCATION, etc. I just want to train my model (using a transformer-based approach) based on custom entities that I have defined in my dataset.

import spacy

def load_spacy():
    spacy.require_gpu()

    # 1) 'Makeshift' method: source the transformer and ner
    #    components from two different pre-trained pipelines
    source_nlp = spacy.load("en_core_web_sm")
    source_nlp_trf = spacy.load("en_core_web_trf")
    nlp = spacy.blank("en")
    nlp.add_pipe("transformer", source=source_nlp_trf)
    nlp.add_pipe("ner", source=source_nlp)

    # 2) trf-only method (overwrites the pipeline built above;
    #    only one of the two methods is used at a time)
    nlp = spacy.load("en_core_web_trf")

    # Get the ner pipeline component
    ner = nlp.get_pipe("ner")
    return ner, nlp

Edit: The exact training methodology I used is described in this Python script, in the fit() function of the NerModel class.

The load_spacy() in the script (line 16) uses the small model; for the transformer experiments I swapped in the load_spacy() definition shown at the beginning of this question.

PS: I have been experimenting on Google Colab (i.e. in a notebook) in order to use a GPU for the transformer, but the source code and methodology are almost the same.

  • Can you describe your training process following the docs more? Did you try the debug data command? If you can train with the small model but not transformers that's really weird. If it's a bug we'd like to fix it. – polm23 Jun 22 '21 at 10:22
  • @polm23 I updated my question with the training methodology (added link to Python script). I basically get 0 values for all scores if I use `nlp = spacy.load("en_core_web_trf")` while loading the model ("trf only method") – archity Jun 22 '21 at 16:45
  • Ah, you wrote your own training loop. Can you try using the config based training outlined in the docs instead? It's really hard to debug a whole training loop, and there's lots of places something there could go wrong. – polm23 Jun 23 '21 at 02:45
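For reference, the config-based workflow polm23 suggests would look something like this (flags and component names per the spaCy v3 docs; treat this as a sketch, not a verified setup): generate a transformer-based NER config with `python -m spacy init config config.cfg --lang en --pipeline ner --optimize accuracy --gpu`, then train with `python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy --gpu-id 0`. The generated config wires the ner component to the transformer via a listener, roughly like:

```ini
[nlp]
lang = "en"
pipeline = ["transformer","ner"]

[components.transformer]
factory = "transformer"

[components.ner]
factory = "ner"

# The ner model embeds tokens by listening to the shared
# transformer component instead of owning its own tok2vec
[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
```

With this setup, custom entity labels are picked up automatically from the training data (the .spacy files), so no manual sourcing of components is needed.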

1 Answer

## Combining both the pre-trained and the custom model

import spacy

# Load the pre-trained pipeline and the custom-trained model
nlp = spacy.load("en_core_web_trf")
cust_nlp = spacy.load(r'enter the location of your custom model')

# Give the custom ner component its own copy of the tok2vec layer
# (instead of a listener), so it can be sourced into another pipeline
cust_nlp.replace_listeners("tok2vec", "ner", ["model.tok2vec"])

# Add the custom ner component after the pre-trained one
nlp.add_pipe("ner", name="ner_custom", source=cust_nlp, after="ner")

doc = nlp("enter or paste the sentence that you want to test")
for ent in doc.ents:
    print(ent.text, ent.label_)
  • Welcome to SO! Please, avoid code-only answers by providing a brief explanation on how the code solves the problem. – Berriel May 04 '23 at 11:10