Questions tagged [spacy-3]

For questions specific to spaCy version 3, an industrial-strength Natural Language Processing library for Python. Use the more generic tag `spacy` for general questions about spaCy.

334 questions
4
votes
1 answer

How do I update my trained spaCy NER model with a new training dataset?

I'm new to NLP and have started learning how to train a custom NER model in spaCy. TRAIN_DATA = [ ('what is the price of polo?', {'entities': [(21, 25, 'Product')]}), ('what is the price of ball?', {'entities': [(21, 25, 'Product')]}), …
Stark
  • 197
  • 1
  • 7
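The training tuples in the excerpt above use character offsets of the form (start, end, label), with the end offset exclusive. A minimal sketch for sanity-checking such offsets with plain Python slicing follows. It uses only the two entries visible in the excerpt (the original list is truncated), and the helper name `entity_spans` is hypothetical, not part of spaCy.

```python
# Only the two entries visible in the excerpt; the original list is truncated.
TRAIN_DATA = [
    ("what is the price of polo?", {"entities": [(21, 25, "Product")]}),
    ("what is the price of ball?", {"entities": [(21, 25, "Product")]}),
]


def entity_spans(text, annotations):
    """Return (substring, label) pairs for each (start, end, label) annotation.

    Hypothetical helper for illustration; it is not a spaCy function.
    """
    return [(text[start:end], label) for start, end, label in annotations["entities"]]


for text, annotations in TRAIN_DATA:
    # Each slice should cover exactly the annotated word, with no stray
    # whitespace or punctuation.
    print(entity_spans(text, annotations))
```

If a slice comes back with a leading space or a clipped word, the offsets are likely misaligned with the text.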
4
votes
0 answers

Spacy alignment differences when training with DocBin vs custom data reading and batching

I am just getting started with training Spacy named entity recognition models, and following the basic example described here, where you create training examples by instantiating Doc objects and serializing those with DocBin. My custom preprocess.py…
user94154
  • 16,176
  • 20
  • 77
  • 116
4
votes
0 answers

MLflow saving a run to a specific experiment id

I am trying to save runs called from MLflow Projects to specific experiment names/ids. Currently, my MLflow project looks like this: name: Echo NLP Project entry_points: generate: parameters: ... command: "python…
Farrandi
  • 41
  • 2
4
votes
1 answer

How to properly use transformer model for custom NER in spaCy v3?

I'm trying to train a Named Entity Recognition (NER) model for custom tags using spaCy version 3. I went through all the documentation on their website but I cannot understand what's the proper way to create the pipeline model. Apparently, if I try…
4
votes
1 answer

Correctly tokenize English contractions with Unicode apostrophes

How do you modify the default spaCy (v3.0.5) tokenizer to correctly split English contractions when Unicode apostrophes (not ') are used? import spacy nlp = spacy.load('en_core_web_sm') apostrophes = ["'",'\u02B9', '\u02BB', '\u02BC', '\u02BD',…
gustavz
  • 2,964
  • 3
  • 25
  • 47
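One possible workaround for the question above, which is not what the asker requested and is not taken from the source, is to normalize the apostrophe variants to ASCII before the text reaches the tokenizer, rather than changing the tokenizer itself. The sketch below is plain Python and maps only the four code points visible in the excerpt, since the original list is truncated. The names `VISIBLE_APOSTROPHES` and `normalize_apostrophes` are hypothetical.

```python
# Code points copied from the visible part of the excerpt; the original
# list is truncated, so this mapping is intentionally incomplete.
VISIBLE_APOSTROPHES = ["\u02B9", "\u02BB", "\u02BC", "\u02BD"]

# Translation table mapping each variant to the ASCII apostrophe.
APOSTROPHE_TABLE = str.maketrans({ch: "'" for ch in VISIBLE_APOSTROPHES})


def normalize_apostrophes(text: str) -> str:
    """Replace the listed apostrophe variants with an ASCII apostrophe.

    Hypothetical preprocessing helper; it changes the input text, so
    character offsets are preserved but the original characters are not.
    """
    return text.translate(APOSTROPHE_TABLE)


print(normalize_apostrophes("I don\u02BCt know"))
```

Because each variant is replaced one-for-one, character offsets stay the same. Whether altering the text is acceptable depends on the application.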
4
votes
1 answer

nlp.update issue with Spacy 3.0: TypeError: [E978] The Language.update method takes a list of Example objects, but got: {}

With Spacy version 3.0 there seem to be some changes with nlp.update. I am utterly confused with this simple code: examples = TRAIN_DATA random.shuffle(examples) losses = {} for batch in minibatch(examples, size=8): nlp.update(batch,…
Nebukadnezar
  • 87
  • 2
  • 11
3
votes
4 answers

AttributeError: module 'click.utils' has no attribute '_expand_args', when I'm trying to install en_core_web_sm

I am trying to install "en_core_web_sm". The commands I ran are: pip install spacy (which ran perfectly and installed) python -m spacy download en_core_web_sm (here I'm getting the error "AttributeError: module 'click.utils' has no attribute…
shubham k.
  • 41
  • 2
3
votes
1 answer

Spacy Rule-Based Matching outputs undesired phrase bit

I was reproducing a Spacy rule-matching example: import spacy from spacy.matcher import Matcher nlp = spacy.load("en_core_web_md") doc = nlp("Good morning, I'm here. I'll say good evening!!") pattern = [{"LOWER": "good"},{"LOWER": {"IN":…
Aureon
  • 141
  • 9
3
votes
1 answer

spaCy model .from_disk doesn't load patterns

For a Named Entity Recognition task in Dutch with spaCy, I added entities using EntityRuler. When I add the ruler to the pipeline in my notebook: nlp = spacy.load("nl_core_news_md") ruler = nlp.add_pipe("entity_ruler", before="ner") patterns =…
IneG
  • 73
  • 7
3
votes
0 answers

Output multiple possible tags with spaCy spancat

The problem I'm working on involves span categorisation with spaCy; however, some of the tags are ambiguous, e.g. span1 => 60% tag1, 40% tag2. I'm trying to figure out if there is a way to get spaCy's output to capture this ambiguity. I have been…
alex
  • 31
  • 2
3
votes
1 answer

Getting similarity score with spacy and a transformer model

I've been using spaCy's en_core_web_lg and wanted to try out en_core_web_trf (the transformer model), but I'm having some trouble wrapping my head around the difference in model/pipeline usage. My use case looks like the following: import spacy from…
Connor
  • 393
  • 2
  • 9
3
votes
1 answer

SpaCy 3.0 - Fine-tuning only NER component while keeping rest intact

I have some training data for a new set of NER labels that are not currently covered by spaCy's default NER model. I have prepared a training_data.spacy file, which exclusively contains annotated examples with the new labels. I am able to train a blank…
3
votes
2 answers

spaCy tokenization adds extra whitespace for dates with a hyphen separator when I manually build the Doc

I've been trying to solve a problem with the spacy Tokenizer for a while, without any success. Also, I'm not sure if it's a problem with the tokenizer or some other part of the pipeline. Description I have an application that for reasons besides the…
Emiliano Viotti
  • 1,619
  • 2
  • 16
  • 30
3
votes
0 answers

How can I configure spacy to train on GPU?

I am trying to make a custom NER model using Spacy. I wish to use my GPU for training. This is my config.cfg [paths] train = "../training_dataset/training.spacy" dev = "../training_dataset/dev.spacy" vectors = null init_tok2vec =…
Vedant Jumle
  • 133
  • 2
  • 11
3
votes
1 answer

How to solve the problem of importing when trying to import 'SentenceSegmenter' from 'spacy.pipeline' package?

ImportError: cannot import name 'SentenceSegmenter' from 'spacy.pipeline' spaCy version: 3.2.1 I know this class is from an earlier version of spaCy, but is there something similar in this version?
arthuraa98
  • 73
  • 4