Questions tagged [spacy-3]

For questions specific to spaCy version 3, an industrial-strength natural language processing library for Python. Use the more generic tag `spacy` for general questions about spaCy.

334 questions
2
votes
1 answer

How to set an extension attribute to a Doc object in spaCy so that it can be retrieved from a slice (Span) of the Doc?

I want to add an extension attribute to a spaCy doc that spans one or more tokens, similar to the entity attribute, so that it could also be accessed when looking at a span which contains that attribute. To clarify, below I set a list containing a…
2
votes
1 answer

spaCy Negation Operator in Pattern Matching

I am trying to write a pattern in spaCy that matches "black" but not "black beans". I tried the code below, but it seems to match the token next to "black" as long as it is not "bean". How do I modify it to match only…
D Patel
  • 55
  • 5
2
votes
1 answer

DocBin to_bytes/to_disk gets killed

I am dealing with fairly large corpora, and my DocBin object gets killed when I try to save it. Both to_disk and to_bytes print "Killed". I have limited Python knowledge, so it isn't immediately obvious to me how I can work around the issue.…
Tech0
  • 253
  • 3
  • 16
2
votes
1 answer

ValueError when trying to train a spaCy model

I have been training a spaCy model, but recently I started getting errors. I got the error below and would like someone to help me resolve it. def train_model(model, train_data, optimizer, batch_size, epochs=10): losses = {} …
Dhanush
  • 63
  • 1
  • 8
2
votes
0 answers

spaCy 3.2.1: How to add new entities without losing the default ones

I am trying to add new entities to the existing model en_core_web_lg. If I don't train the model, it recognizes all the entities. As soon as I train it, it seems to forget all the previous entities. Can someone help me, please? Here's the…
2
votes
1 answer

Add entity label to existing spaCy model

I would like to further train the en_core_web_trf model to recognise another entity tag. I'm using spaCy's CLI, as recommended in the documentation. The training begins with no issues on Google Colab with a GPU; I was just wondering if I missed something in my…
Paschalis
  • 191
  • 10
2
votes
1 answer

spaCy: Detect if entity is quoted

I need to detect whether a given entity is surrounded by quotes, either single or double. How would I go about this? My first thought was to add a custom extension to the span: def is_quoted(span): prev_token = span.doc[span.start - 1] …
user17795506
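
A minimal sketch of the quote check described in this excerpt, written against a plain list of token strings rather than spaCy objects. The names `QUOTE_CHARS` and `is_quoted(tokens, start, end)` are hypothetical, and the bounds check is an assumption about what the truncated code may be missing: an index of `start - 1` equal to -1 would silently wrap to the last token.

```python
# Hypothetical stand-in for a spaCy Span: a token list plus start/end offsets.
QUOTE_CHARS = {'"', "'", "“", "”", "‘", "’"}

def is_quoted(tokens, start, end):
    """Return True if tokens[start:end] is directly enclosed by quote tokens."""
    # Guard the boundaries: a span at the very start or end of the text
    # has no neighbouring token on one side, so it cannot be quoted.
    if start <= 0 or end >= len(tokens):
        return False
    return tokens[start - 1] in QUOTE_CHARS and tokens[end] in QUOTE_CHARS

tokens = ["He", "read", '"', "Moby", "Dick", '"', "twice", "."]
quoted = is_quoted(tokens, 3, 5)     # "Moby Dick" sits between two quote tokens
unquoted = is_quoted(tokens, 0, 2)   # "He read" starts at index 0
```

This sketch does not check that the opening and closing quotes are of the same kind; that would be a further refinement.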
2
votes
1 answer

BILOU tagging scheme for multi-word entities in spaCy's NER

I am building a custom NER with spaCy to recognize new entities in addition to those in spaCy's built-in NER. My training data now needs to be tagged and added using spacy.Example. I am using the BILOU scheme. My question is that I have entities which have…
2
votes
1 answer

spaCy Entity Ruler pattern isn't working for ent_type

I am trying to get the entity ruler patterns to use a combination of lemma and ent_type to generate a tag for the phrase "landed (or land) in Baltimore (location)". It seems to work with the Matcher, but not with the entity ruler I created. I set the…
scarpacci
  • 8,957
  • 16
  • 79
  • 144
2
votes
1 answer

Difference between model-best and model-last in spaCy

After model training, spaCy generated the model\model-best and model\model-last folders. What is the difference between the two models, and which one should be used for predictions?
GlobalLearner
  • 47
  • 1
  • 5
2
votes
3 answers

spaCy, preparing training data: doc.char_span returning 'None'

I'm following the instructions in spaCy's documentation to prepare my own training data (here). My problem begins at this line: span = doc.char_span(start, end, label=label) For entities which I'm labelling as an organization ('ORG'), it seems to…
mess1n
  • 53
  • 1
  • 8
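
One common cause of a `None` result from `doc.char_span` is character offsets that do not line up with token boundaries. The sketch below illustrates the idea in plain Python, assuming a simple whitespace tokenizer (spaCy's real tokenizer splits punctuation differently); `token_boundaries` and `strict_char_span` are hypothetical helpers, not spaCy functions.

```python
import re

def token_boundaries(text):
    """Collect start and end character offsets of whitespace-delimited tokens."""
    starts, ends = set(), set()
    for match in re.finditer(r"\S+", text):
        starts.add(match.start())
        ends.add(match.end())
    return starts, ends

def strict_char_span(text, start, end):
    """Return text[start:end] only if both offsets fall on token boundaries."""
    starts, ends = token_boundaries(text)
    if start < end and start in starts and end in ends:
        return text[start:end]
    return None  # misaligned offsets, analogous to a strict char_span lookup

text = "Shares of Acme Corp. rose"
aligned = strict_char_span(text, 10, 20)     # covers the whole tokens "Acme" and "Corp."
misaligned = strict_char_span(text, 10, 19)  # ends inside the token "Corp."
```

In spaCy itself, `doc.char_span` accepts an `alignment_mode` argument ("strict", "contract", or "expand") that controls how misaligned offsets are handled.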
2
votes
1 answer

Match only the largest pattern and not sub patterns with Matcher spaCy

I am using the Matcher class and spaCy v3.1.3 to extract patterns from sentences. I have a label called "Inteval" and two patterns that I want to capture. However, the largest pattern (the second "pattern" shown below) is made up of a sub-pattern…
user140259
  • 450
  • 2
  • 5
  • 16
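
A minimal sketch of keeping only the longest of several overlapping matches, using plain `(start, end)` token offsets instead of spaCy objects. The helper `keep_longest` is hypothetical; spaCy offers comparable behaviour through `spacy.util.filter_spans` and the `greedy="LONGEST"` option of `Matcher.add`, and this sketch only illustrates the general idea.

```python
def keep_longest(spans):
    """Keep the longest non-overlapping spans from (start, end) pairs, end exclusive."""
    # Consider longer spans first; among equal lengths, prefer the earlier start.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept, taken = [], set()
    for start, end in ordered:
        # Skip any span that shares a token with an already accepted span.
        if not any(i in taken for i in range(start, end)):
            kept.append((start, end))
            taken.update(range(start, end))
    return sorted(kept)

# (0, 2) and (2, 4) are sub-patterns of the larger match (0, 4).
matches = [(0, 2), (0, 4), (2, 4), (6, 7)]
result = keep_longest(matches)
```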
2
votes
1 answer

Number is recognized as a noun in spaCy Portuguese model

Just out of curiosity I would like to ask why the number "4950" has the PoS (part of speech) of "NOUN" in spaCy v3.1.3, using the large model in Portuguese. It is not in the GitHub token exception file…
user140259
  • 450
  • 2
  • 5
  • 16
2
votes
3 answers

Could not find function 'spacy-transformers.TransformerModel.v3' in function registry 'architectures'

I was trying to create a custom NER model using the spaCy library. This line of code creates the config file from the base.config file. My code is: !python -m spacy init fill-config…
2
votes
1 answer

spaCy 3.1 - KeyError: 'train' using spacy train command

I'm following this tutorial https://spacy.io/usage/training#quickstart in order to train a custom DistilBERT model. Everything is already installed, the data is converted, and the config file is ready. When I launch this training command: python -m…