For questions specific to spaCy version 3, an industrial-strength natural language processing library for Python. Use the more generic tag `spacy` for general questions about spaCy.
Questions tagged [spacy-3]
334 questions
4
votes
1 answer
How do I update my trained spaCy NER model with a new training dataset?
I'm new to NLP and have started learning how to train a custom NER model in spaCy.
TRAIN_DATA = [
    ('what is the price of polo?', {'entities': [(21, 25, 'Product')]}),
    ('what is the price of ball?', {'entities': [(21, 25, 'Product')]}),
…

Stark
- 197
- 1
- 7
4
votes
0 answers
Spacy alignment differences when training with DocBin vs custom data reading and batching
I am just getting started with training spaCy named entity recognition models, and I am following the basic example described here, where you create training examples by instantiating Doc objects and serializing them with DocBin.
My custom preprocess.py…

user94154
- 16,176
- 20
- 77
- 116
4
votes
0 answers
MLflow saving a run to a specific experiment id
I am trying to save runs called from MLflow Projects to specific experiment names/ids.
Currently, my MLflow project looks like this:
name: Echo NLP Project
entry_points:
  generate:
    parameters:
      ...
    command: "python…

Farrandi
- 41
- 2
4
votes
1 answer
How to properly use transformer model for custom NER in spaCy v3?
I'm trying to train a Named Entity Recognition (NER) model for custom tags using spaCy version 3. I went through all the documentation on their website, but I cannot understand what the proper way to create the pipeline model is. Apparently, if I try…

archity
- 562
- 3
- 11
- 22
4
votes
1 answer
Correctly tokenize English contractions with Unicode apostrophes
How do you modify the default spaCy (v3.0.5) tokenizer to correctly split English contractions when Unicode apostrophes (not ') are used?
import spacy
nlp = spacy.load('en_core_web_sm')
apostrophes = ["'",'\u02B9', '\u02BB', '\u02BC', '\u02BD',…

gustavz
- 2,964
- 3
- 25
- 47
4
votes
1 answer
nlp.update issue with Spacy 3.0: TypeError: [E978] The Language.update method takes a list of Example objects, but got: {}
With spaCy version 3.0 there seem to be some changes to nlp.update.
I am utterly confused by this simple code:
examples = TRAIN_DATA
random.shuffle(examples)
losses = {}
for batch in minibatch(examples, size=8):
    nlp.update(batch,…

Nebukadnezar
- 87
- 2
- 11
3
votes
4 answers
AttributeError: module 'click.utils' has no attribute '_expand_args' when I'm trying to install en_core_web_sm
I am trying to install "en_core_web_sm".
The commands I ran are:
pip install spacy (which ran perfectly and installed)
python -m spacy download en_core_web_sm (here I'm getting the error "AttributeError: module 'click.utils' has no attribute…

shubham k.
- 41
- 2
3
votes
1 answer
Spacy Rule-Based Matching outputs undesired phrase bit
I was reproducing a Spacy rule-matching example:
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_md")
doc = nlp("Good morning, I'm here. I'll say good evening!!")
pattern = [{"LOWER": "good"},{"LOWER": {"IN":…

Aureon
- 141
- 9
3
votes
1 answer
spaCy model .from_disk doesn't load patterns
For a Named Entity Recognition task in Dutch with spaCy, I added entities using EntityRuler. When I add the ruler to the pipeline in my notebook:
nlp = spacy.load("nl_core_news_md")
ruler = nlp.add_pipe("entity_ruler", before="ner")
patterns =…

IneG
- 73
- 7
3
votes
0 answers
Output multiple possible tags with spaCy spancat
The problem I'm working on involves span categorisation with spaCy. However, some of the tags are ambiguous, e.g. span1 => 60% tag1, 40% tag2.
I'm trying to figure out if there is a way to get spaCy's output to capture this ambiguity.
I have been…

alex
- 31
- 2
3
votes
1 answer
Getting similarity score with spacy and a transformer model
I've been using spaCy's en_core_web_lg and wanted to try out en_core_web_trf (the transformer model), but I'm having some trouble wrapping my head around the difference in model/pipeline usage.
My use case looks like the following:
import spacy
from…

Connor
- 393
- 2
- 9
3
votes
1 answer
SpaCy 3.0 - Fine-tuning only NER component while keeping rest intact
I have some training data for a new set of NER labels that are not currently covered by spaCy's default NER model. I have prepared a training_data.spacy file, which exclusively contains annotated examples with the new labels. I am able to train a blank…

abhinavkulkarni
- 2,284
- 4
- 36
- 54
3
votes
2 answers
spaCy tokenization adds extra white space for dates with a hyphen separator when I manually build the Doc
I've been trying to solve a problem with the spacy Tokenizer for a while, without any success. Also, I'm not sure if it's a problem with the tokenizer or some other part of the pipeline.
Description
I have an application that for reasons besides the…

Emiliano Viotti
- 1,619
- 2
- 16
- 30
3
votes
0 answers
How can I configure spacy to train on GPU?
I am trying to make a custom NER model using Spacy. I wish to use my GPU for training. This is my config.cfg
[paths]
train = "../training_dataset/training.spacy"
dev = "../training_dataset/dev.spacy"
vectors = null
init_tok2vec =…

Vedant Jumle
- 133
- 2
- 11
3
votes
1 answer
How do I fix the import error when trying to import 'SentenceSegmenter' from the 'spacy.pipeline' package?
ImportError: cannot import name 'SentenceSegmenter' from 'spacy.pipeline'
Spacy version: 3.2.1
I know this class is from an earlier version of spaCy, but is there something similar in this version?

arthuraa98
- 73
- 4