I am trying to predict entities with a custom-trained NER model in spaCy. I read in https://github.com/explosion/spaCy/pull/8855 that a confidence score for each entity can be obtained using spancat, but I am confused about how to make that work. As I understand it, we have to train a pipeline that includes the spancat component. The training config file contains this section:
[nlp]
lang = "en"
pipeline = ["tok2vec","ner"]
batch_size = 1000
Do we have to change this to
[nlp]
lang = "en"
pipeline = ["tok2vec","ner","spancat"]
batch_size = 1000
for spancat to work?
Then, after training, when predicting entities in unseen text, do we use
doc = nlp(data_to_be_predicted)
spans = doc.spans["spancat"] # SpanGroup
print(spans.attrs["scores"]) # list of numbers, same length as SpanGroup
to get the confidence scores?
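To make the question concrete, here is a rough sketch of what I expect the prediction step to look like. It assumes that spancat stores its predictions in `doc.spans` under the key set by `spans_key` in the config (which, as far as I can tell, defaults to "sc" rather than "spancat") and that `attrs["scores"]` holds one score per span in the same order. `spans_with_scores` is a hypothetical helper of my own, not part of spaCy.

```python
def spans_with_scores(doc, spans_key="sc"):
    """Pair each predicted span with its confidence score.

    Assumption: doc.spans[spans_key] is the SpanGroup written by spancat,
    and its attrs["scores"] is aligned one-to-one with the spans.
    """
    group = doc.spans[spans_key]
    scores = group.attrs["scores"]
    # zip relies on spans and scores being in the same order
    return [
        (span.text, span.label_, float(score))
        for span, score in zip(group, scores)
    ]
```

With a trained pipeline I would call this as `spans_with_scores(nlp(data_to_be_predicted))` and expect a list of (text, label, score) tuples.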
I am using spaCy 3.1.3. According to the documentation, I believe this feature has been released by now.