For questions specific to spaCy version 3, an industrial-strength natural language processing library in Python. Use the more generic tag `spacy` for general questions about spaCy.
Questions tagged [spacy-3]
334 questions
3
votes
0 answers
Enable multiprocessing on CPU in Spacy pipeline
Good day everyone,
I'm using the latest spaCy release (3.2.0), which has a multiprocessing feature: https://spacy.io/usage/processing-pipelines
import spacy
import time
nlp = spacy.load("en_core_web_trf")
start_time = time.time()
for doc in…

Anton
- 919
- 7
- 22
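
For the question above, a minimal sketch of the documented route: nlp.pipe with n_process > 1 spreads documents over worker processes. The small en_core_web_sm model and the placeholder text list are assumptions to keep the example self-contained; transformer pipelines such as en_core_web_trf are typically far heavier per process on CPU.

import spacy
import time

# n_process > 1 forks worker processes; batch_size controls how many
# texts each worker receives at a time.
nlp = spacy.load("en_core_web_sm")  # small model, just for illustration
texts = ["This is a sample sentence."] * 1000  # placeholder corpus

start_time = time.time()
for doc in nlp.pipe(texts, n_process=2, batch_size=100):
    pass  # consume each Doc here
print(f"Elapsed: {time.time() - start_time:.2f}s")
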
3
votes
1 answer
Spacy: count occurrence for specific token in each sentence
I want to count the occurrences of the token "and" in each sentence of a corpus using spaCy and append the result for each sentence to a list. So far, the code below returns the total count for the whole corpus regarding "and".
Example/Desired…

Artemis
- 145
- 7
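
A minimal sketch of one way to get per-sentence counts instead of a corpus-wide total, assuming a pipeline that sets sentence boundaries (here the parser in en_core_web_sm); the sample text is made up.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I like tea and coffee. Cats and dogs and birds are pets.")

# Iterate over sentences and count the target token inside each one.
counts = [sum(token.text == "and" for token in sent) for sent in doc.sents]
print(counts)  # one count per sentence, e.g. [1, 2]
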
3
votes
1 answer
Unexpected result using Spacy regex
I get an unexpected result when matching regular expressions with spaCy (version 3.1.3).
I define a simple regex to identify a digit. Then I create strings made of a digit and a letter and try to identify them. Everything works as expected but with…

Alfonso Sastre
- 53
- 4
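
A sketch of the usual pitfall behind results like this: the Matcher's REGEX operator is applied to each token's text, not to the raw string, so whether a pattern like ^\d$ matches depends on how the input was tokenised. The pattern and sample strings below are illustrative.

import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# REGEX is evaluated per token, so "1a" only matches if the tokenizer
# splits the digit from the letter.
matcher.add("DIGIT", [[{"TEXT": {"REGEX": r"^\d$"}}]])

for text in ("1 a", "1a", "a 1"):
    doc = nlp(text)
    matches = [doc[start:end].text for _, start, end in matcher(doc)]
    print(text, [t.text for t in doc], matches)
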
3
votes
2 answers
How to use Hugging Face transformers with spaCy 3.0
Let's say that I want to include distilbert https://huggingface.co/distilbert-base-uncased from Hugging Face in a spaCy 3.0 pipeline. I think this is possible, and I found some code on how to convert this model for spaCy 2.0, but it doesn't work…

EnesZ
- 403
- 3
- 16
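
A sketch of one way to pull a Hugging Face checkpoint into a spaCy 3 pipeline via spacy-transformers (assumed to be installed). The assumption here is that overriding model.name in the transformer component's config is enough; trained task components still have to be added on top for the pipeline to do anything useful.

import spacy

# Assumes the spacy-transformers package is installed.
nlp = spacy.blank("en")
nlp.add_pipe(
    "transformer",
    config={"model": {"name": "distilbert-base-uncased"}},  # any HF checkpoint name
)
nlp.initialize()  # downloads and loads the checkpoint

doc = nlp("spaCy attaches the transformer output to each Doc.")
print(doc._.trf_data.tensors[0].shape)  # contextual wordpiece representations
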
3
votes
2 answers
spacy 3 train custom ner model
I tried to train on this dataset:
[('text data text data text data text data text data text data text data text data.',
{'entities': [(7, 19, 'PERSON'), (89, 91, 'PERSON'), (98, 101, 'PERSON')]}),
('"text data text data text data text data text data text…

zzzz
- 71
- 7
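
In spaCy 3 the usual route is to convert the old (text, {"entities": [...]}) tuples into a .spacy file with DocBin and then train from the command line with a config; a minimal sketch with a made-up one-example dataset:

import spacy
from spacy.tokens import DocBin

TRAIN_DATA = [
    ("John Smith met Anna in Berlin.", {"entities": [(0, 10, "PERSON"), (15, 19, "PERSON")]}),
]

nlp = spacy.blank("en")
db = DocBin()
for text, annotations in TRAIN_DATA:
    doc = nlp.make_doc(text)
    spans = []
    for start, end, label in annotations["entities"]:
        span = doc.char_span(start, end, label=label)
        if span is not None:  # offsets must line up with token boundaries
            spans.append(span)
    doc.ents = spans
    db.add(doc)
db.to_disk("./train.spacy")

# Then, on the command line (dev set reused here only to keep the sketch short):
#   python -m spacy init config config.cfg --lang en --pipeline ner
#   python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./train.spacy
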
3
votes
2 answers
No vector when using spacy.load('en_core_web_trf')?
After running this
nlp = spacy.load('en_core_web_lg')
has_vector = nlp('test text').has_vector
# ...
...has_vector == True
But after running this
nlp = spacy.load('en_core_web_trf')
has_vector = nlp('test text').has_vector
# ...
...has_vector ==…

VikR
- 4,818
- 8
- 51
- 96
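
For context, a short sketch of the documented behaviour: en_core_web_trf ships no static word vectors, so has_vector is False; the contextual transformer output is attached to the Doc under doc._.trf_data instead.

import spacy

nlp = spacy.load("en_core_web_trf")
doc = nlp("test text")

print(doc.has_vector)                   # False: no static vector table
print(type(doc._.trf_data))             # TransformerData holding the model output
print(doc._.trf_data.tensors[0].shape)  # contextual wordpiece representations
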
3
votes
1 answer
Visualizing customized IOB tags with SpaCy DisplaCy
I have a text file from which I have created a Doc object using spaCy:
doc = nlp.make_doc(raw_text)
I also have a list of customized IOB tags, one for each word in this Doc object:
['O', 'B-PER', 'I-PER', 'O', 'O', 'O', 'O', 'B-DATE', 'I-DATE', 'I-DATE',…

BlueBlue
- 81
- 1
- 8
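
A minimal sketch of one way to do this: convert the IOB tags to BILUO, turn them into Span objects, attach the spans as entities, and render with displaCy. The sentence and tags below are invented and must line up one tag per token.

import spacy
from spacy import displacy
from spacy.training import iob_to_biluo, biluo_tags_to_spans

nlp = spacy.blank("en")
doc = nlp.make_doc("John Smith was born on 1 January 1990 .")
iob_tags = ["B-PER", "I-PER", "O", "O", "O", "B-DATE", "I-DATE", "I-DATE", "O"]

# One tag per token: IOB -> BILUO -> Span objects, attached as doc.ents.
doc.ents = biluo_tags_to_spans(doc, iob_to_biluo(iob_tags))
displacy.render(doc, style="ent")  # use displacy.serve(...) outside a notebook
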
3
votes
3 answers
Training epochs interpretation during spaCy NER training
I was training my NER model with transformers and am not really sure why the training stopped at some point, or why it even ran through so many batches. This is what the relevant part of my configuration file looks like:
[training]
train_corpus =…

archity
- 562
- 3
- 11
- 22
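
As a rough reading of the relevant knobs, hedged against the defaults: in the [training] block, max_steps caps the total number of optimizer steps, max_epochs caps passes over the corpus (0 means no limit), and patience stops training once the dev score has not improved for that many steps, so a run can end well before max_steps. The values below are only illustrative.

[training]
# 0 = no epoch limit; training stops on max_steps or patience instead
max_epochs = 0
# hard cap on the total number of optimizer update steps
max_steps = 20000
# early stopping: stop after this many steps without a new best dev score
patience = 1600
# how often (in steps) the dev set is scored
eval_frequency = 200
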
3
votes
1 answer
Getting similar words no longer working in spacy
I have a Google Colab notebook from a while ago which uses spacy 2.2.4 and successfully gets the most similar words for a list of words:
import spacy
import spacy.cli
spacy.cli.download("en_core_web_lg")
import en_core_web_lg
nlp =…

code_to_joy
- 569
- 1
- 9
- 27
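
A sketch of the v3-style call, assuming en_core_web_lg is installed: Vectors.most_similar now takes a 2D array of query vectors and returns arrays of keys, rows, and scores, so the older v2 idioms need a small rewrite.

import numpy as np
import spacy

nlp = spacy.load("en_core_web_lg")

# Query vectors must be 2D: one row per word to look up.
queries = np.asarray([nlp.vocab["dog"].vector])
keys, _, scores = nlp.vocab.vectors.most_similar(queries, n=10)

# Map the returned hash keys back to strings.
print([nlp.vocab.strings[int(k)] for k in keys[0]])
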
3
votes
2 answers
How to evaluate a trained spaCy version 3 model?
I want to evaluate my trained spaCy model with the built-in Scorer function using this code:
def evaluate(ner_model, examples):
    scorer = Scorer()
    for input_, annot in examples:
        text = nlp.make_doc(input_)
        gold =…

Malte
- 329
- 3
- 14
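
In spaCy 3 the Scorer works on Example objects rather than (text, annotations) pairs, and the simplest route is nlp.evaluate; a sketch with a placeholder model path and a made-up one-item evaluation set:

import spacy
from spacy.training import Example

nlp = spacy.load("my_ner_model")  # placeholder path to the trained pipeline

eval_data = [
    ("John Smith lives in Berlin.", {"entities": [(0, 10, "PERSON"), (20, 26, "GPE")]}),
]

examples = []
for text, annotations in eval_data:
    # Attach the reference annotations to a fresh Doc; nlp.evaluate runs the
    # pipeline on the text and scores the predictions against them.
    examples.append(Example.from_dict(nlp.make_doc(text), annotations))

scores = nlp.evaluate(examples)
print(scores["ents_p"], scores["ents_r"], scores["ents_f"])
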
3
votes
3 answers
Given a word can we get all possible lemmas for it using Spacy?
The input word is standalone and not part of a sentence, but I would like to get all of its possible lemmas as if the input word were in different sentences with all possible POS tags. I would also like to get the lookup version of the word's…

PSK
- 347
- 2
- 13
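
One possible sketch, not an official recipe: the English rule-based lemmatizer keys its rules on the coarse POS, so a standalone word can be wrapped in a one-token Doc, assigned each candidate POS in turn, and passed through Lemmatizer.rule_lemmatize; lookup_lemmatize gives the lookup-table variant when that table is available.

import spacy

nlp = spacy.load("en_core_web_sm")
lemmatizer = nlp.get_pipe("lemmatizer")

word = "meeting"
candidates = {}
for pos in ("NOUN", "VERB", "ADJ", "ADV"):
    doc = nlp.make_doc(word)   # one-token Doc, no pipeline run
    doc[0].pos_ = pos          # pretend the word carries this POS
    candidates[pos] = lemmatizer.rule_lemmatize(doc[0])

print(candidates)  # possible lemmas per assumed POS
print(lemmatizer.lookup_lemmatize(nlp.make_doc(word)[0]))  # lookup-table variant
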
3
votes
1 answer
Spacy - Use two trainable components with two different datasets
I was wondering if it is possible to train two trainable components in spaCy with two different datasets?
In fact, I would like to use the NER component and the text classifier, but since the training datasets for these two components should be annotated…

JulienD
- 35
- 5
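
One commonly suggested pattern, sketched with made-up paths: train each component in its own run, with its own config and data, then source the trained component into the other pipeline so a single nlp object carries both.

import spacy

# Hypothetical output directories of two separate training runs.
nlp = spacy.load("training_ner/model-best")
nlp_textcat = spacy.load("training_textcat/model-best")

# Copy the trained textcat component, with its weights, into the NER pipeline.
nlp.add_pipe("textcat", source=nlp_textcat)
nlp.to_disk("combined_model")

doc = nlp("Some text to tag and classify.")
print(doc.ents, doc.cats)
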
3
votes
1 answer
Packaging SpaCy Model with Pyinstaller: E050 Can't find model
I'm using PyInstaller to package my Python spaCy code. I'm using de_core_news_sm, installed via pip.
The plain script runs as expected, but as soon as it is packaged with PyInstaller it cannot find the model: [E050] Can't find model…

jre_joe
- 61
- 4
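
A common workaround, offered as a sketch rather than a guaranteed fix: import the model as a regular Python package and load it through that package instead of the name-based spacy.load, and make sure PyInstaller actually bundles the package (for example via hiddenimports in the .spec file).

# Loading through the package sidesteps spaCy's model-path lookup,
# which is what typically fails inside a PyInstaller bundle.
import de_core_news_sm

nlp = de_core_news_sm.load()
doc = nlp("Dies ist ein Testsatz.")
print([token.text for token in doc])

# In the PyInstaller .spec file, list the model so it gets collected, e.g.:
#   hiddenimports=["de_core_news_sm"]
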
3
votes
1 answer
Migrate trained Spacy 2 pipelines to Spacy 3
I've been using spaCy 2.3.1 until now and have trained and saved a couple of pipelines for my custom Language class. But now, using spaCy 3.0 and spacy.load('model-path'), I'm facing problems such as the config.cfg file not being found and other kinds of…

Marzi Heidari
- 2,660
- 4
- 25
- 57
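
For context, hedged as general guidance rather than a drop-in fix: spaCy 3 cannot load pipelines saved with spaCy 2, so the usual options are to pin spaCy 2 for the old models or to regenerate a v3 config and retrain from the original annotations. The paths and file names below are placeholders.

# create a fresh v3 config for the same components
python -m spacy init config config.cfg --lang en --pipeline ner
# convert spaCy 2 JSON training data to the binary .spacy format
python -m spacy convert ./train_v2.json ./corpus
# retrain and save a pipeline that spacy.load() in v3 can read
python -m spacy train config.cfg --paths.train ./corpus/train_v2.spacy --paths.dev ./corpus/train_v2.spacy
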
3
votes
1 answer
Memory leak with en_core_web_trf model, Spacy
There is a memory leak when using pipe with the en_core_web_trf model. I run the model on a GPU with 16 GB of RAM; here is a sample of the code.
!python -m spacy download en_core_web_trf
import en_core_web_trf
nlp = en_core_web_trf.load()
#it's just an…

moro clash
- 199
- 1
- 2
- 6
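
A mitigation sketch rather than a root-cause fix: process the corpus in chunks with nlp.pipe, keep only the extracted results instead of the Doc objects (each Doc carries the transformer tensors), and release cached GPU memory between chunks. torch.cuda.empty_cache() assumes the PyTorch backend running on GPU.

import spacy
import torch

nlp = spacy.load("en_core_web_trf")

def extract_entities(texts, chunk_size=200):
    results = []
    for i in range(0, len(texts), chunk_size):
        chunk = texts[i:i + chunk_size]
        for doc in nlp.pipe(chunk, batch_size=32):
            # Keep only lightweight results; dropping the Doc lets its
            # attached transformer tensors be garbage-collected.
            results.append([(ent.text, ent.label_) for ent in doc.ents])
        torch.cuda.empty_cache()  # release cached GPU blocks between chunks
    return results
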