
I am creating a blank spaCy model that resumes will be passed through. However, the model does not seem to train on the training data I feed it: the losses dictionary is empty after every iteration.

import spacy
import pickle
import random

train_data = pickle.load(open('train_data (1).pkl', 'rb'))

nlp = spacy.blank('en')


def train_model(train_data):
    if 'ner' not in nlp.pipe_names:
        ner = nlp.create_pipe('ner')
        nlp.add_pipe('ner')

    for _, annotation in train_data:
        for ent in annotation['entities']:
            ner.add_label(ent[2])

    pipe_exceptions = ["ner", "trf_wordpiecer", "trf_tok2vec"]
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]

    with nlp.disable_pipes(*other_pipes):

        optimizer = nlp.begin_training()
        for itn in range(10):
            print("starting iteration " + str(itn))
            random.shuffle(train_data)
            losses = {}
            for text, annotations in train_data:
                try:
                    nlp.update([text], [annotations], drop=0.2, sgd=optimizer, losses=losses)

                except Exception as e:
                    pass

            print(losses)


train_model(train_data)

This is the output I get:

starting iteration 0
{}
starting iteration 1
{}
starting iteration 2
{}
starting iteration 3
{}
starting iteration 4
{}
starting iteration 5
{}
starting iteration 6
{}
starting iteration 7
{}
starting iteration 8
{}
starting iteration 9
{}

  • It looks like you're following the spaCy v2 way of training. I recommend you take a look at the spaCy course and use v3, which has been out over a year now and is easier to use. The course has a full example of training an NER model you can use to get started. https://course.spacy.io/en/ – polm23 Feb 15 '22 at 06:09
  • What version of spacy are you using? – tauft Feb 18 '22 at 06:15
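To make the first comment concrete: an empty losses dict on every iteration suggests that each nlp.update call raised and the bare `except: pass` hid the error. In spaCy v3, nlp.update expects a list of Example objects rather than separate lists of texts and annotations. Below is a minimal sketch of a v3-style loop, assuming spaCy 3.x is installed and that the training data is a list of (text, {"entities": [(start, end, label), ...]}) pairs as in the question. The function name train_ner, the two sample sentences, and the labels NAME and COMPANY are made up for illustration and are not taken from the asker's data.

```python
import random

import spacy
from spacy.training import Example


def train_ner(train_data, n_iter=10, drop=0.2):
    """Train a blank English NER pipeline (spaCy v3 style).

    Returns the pipeline and the losses dict from the last iteration.
    """
    nlp = spacy.blank("en")
    nlp.add_pipe("ner")  # in v3 the component is added by its string name

    # v3 trains on Example objects, not on raw (text, annotations) pairs.
    examples = [
        Example.from_dict(nlp.make_doc(text), annotations)
        for text, annotations in train_data
    ]

    # initialize() infers the entity labels from the examples it is given.
    optimizer = nlp.initialize(lambda: examples)

    losses = {}
    for itn in range(n_iter):
        random.shuffle(examples)
        losses = {}
        for example in examples:
            # No try/except here: any error should surface, not be swallowed.
            nlp.update([example], drop=drop, sgd=optimizer, losses=losses)
        print(f"iteration {itn}: {losses}")
    return nlp, losses


# Hypothetical stand-in for the pickled resume data (character offsets).
sample_data = [
    ("Alice Smith worked at Acme Corp.",
     {"entities": [(0, 11, "NAME"), (22, 31, "COMPANY")]}),
    ("Bob Jones joined Globex in 2019.",
     {"entities": [(0, 9, "NAME"), (17, 23, "COMPANY")]}),
]

nlp, losses = train_ner(sample_data, n_iter=2)
```

If the original loop has to stay as it is for now, replacing `pass` with `print(e)` in the except block should at least show which error is being raised on each update.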
