Questions tagged [spacy-3]

For questions specific to spaCy version 3, an industrial-strength natural language processing library for Python. Use the more generic tag `spacy` for general questions about spaCy.

334 questions
3
votes
0 answers

Enable multiprocessing on CPU in Spacy pipeline

Good day everyone, I'm using the latest spaCy release (3.2.0), which has a multiprocessing feature https://spacy.io/usage/processing-pipelines import spacy import time nlp = spacy.load("en_core_web_trf") start_time = time.time() for doc in…
Anton
  • 919
  • 7
  • 22
3
votes
1 answer

Spacy: count occurrence for specific token in each sentence

I want to count the occurrences of the token `and` in each sentence of a corpus using spaCy and append the result for each sentence to a list. So far, the code below returns the total number of occurrences of `and` for the whole corpus. Example/Desired…
Artemis
  • 145
  • 7
3
votes
1 answer

Unexpected result using Spacy regex

I get an unexpected result when matching regular expressions using spaCy (version 3.1.3). I define a simple regex to identify a digit. Then I create strings made of a digit and a letter and try to identify them. Everything works as expected but with…
3
votes
2 answers

How to use Hugging Face transformers with spaCy 3.0

Let's say that I want to include distilbert https://huggingface.co/distilbert-base-uncased from Hugging Face in a spaCy 3.0 pipeline. I think that this is possible, and I found some code on how to convert this model for spaCy 2.0, but it doesn't work…
EnesZ
  • 403
  • 3
  • 16
3
votes
2 answers

spacy 3 train custom ner model

I tried to train datasets : [('text data text data text data text data text data text data text data text data.', {'entities': [(7, 19, 'PERSON'), (89, 91, 'PERSON'), (98, 101, 'PERSON')]}), ('"text data text data text data text data text data text…
zzzz
  • 71
  • 7
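When training data uses the `(text, {'entities': [(start, end, label)]})` layout shown in the excerpt above, it can help to sanity-check the character offsets before training, since spaCy's `doc.ents` does not allow overlapping spans. The following is a minimal sketch, not part of the original question: the helper name `check_entities` and the sample sentence are hypothetical.

```python
def check_entities(text, entities):
    """Return a list of problems found in (start, end, label) annotations.

    Hypothetical helper for illustration; it flags offsets that fall
    outside the text and spans that overlap an earlier span.
    """
    problems = []
    last_end = 0  # end offset of the furthest valid span seen so far
    for start, end, label in sorted(entities):
        if not (0 <= start < end <= len(text)):
            problems.append(
                f"{label} ({start}, {end}): outside text of length {len(text)}"
            )
        elif start < last_end:
            problems.append(f"{label} ({start}, {end}): overlaps previous entity")
        else:
            last_end = end
    return problems


# Hypothetical sample in the same layout as the excerpt.
sample_text = "Anna met Bob."
good = [(0, 4, "PERSON"), (9, 12, "PERSON")]
overlapping = [(0, 4, "PERSON"), (2, 6, "PERSON")]
out_of_range = [(9, 20, "PERSON")]

for annotations in (good, overlapping, out_of_range):
    print(check_entities(sample_text, annotations))
```

An empty list means the annotations passed both checks; it does not guarantee that the offsets line up with spaCy's token boundaries.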
3
votes
2 answers

No vector when Using spacy.load('en_core_web_trf')?

After running this nlp = spacy.load('en_core_web_lg') has_vector = nlp('test text').has_vector # ... ...has_vector == True But after running this nlp = spacy.load('en_core_web_trf') has_vector = nlp('test text').has_vector # ... ...has_vector ==…
VikR
  • 4,818
  • 8
  • 51
  • 96
3
votes
1 answer

Visualizing customized IOB tags with SpaCy DisplaCy

I have a text file from which I have created a Doc object using spaCy: doc = nlp.make_doc(raw_text) I also have a list of customized IOB tags for each word in this Doc object: ['O', 'B-PER', 'I-PER', 'O', 'O', 'O', 'O', 'B-DATE', 'I-DATE', 'I-DATE',…
BlueBlue
  • 81
  • 1
  • 8
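One possible approach to the question above, sketched under assumptions: turn the per-token IOB tags into character-offset spans and build the dictionary that displaCy accepts in manual mode. The function name `iob_to_displacy` and the sample tokens are hypothetical, and the sketch assumes tokens are joined by single spaces, which may not match the original text.

```python
def iob_to_displacy(tokens, tags):
    """Build a displaCy manual-mode dict from tokens and IOB tags.

    Assumption for this sketch: the text is the tokens joined by single
    spaces. An I- tag with no matching open entity is ignored.
    """
    ents = []
    current = None  # entity span currently being extended
    offset = 0      # character offset of the current token
    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        if tag.startswith("B-"):
            if current:
                ents.append(current)
            current = {"start": start, "end": end, "label": tag[2:]}
        elif tag.startswith("I-") and current and current["label"] == tag[2:]:
            current["end"] = end  # extend the open entity
        else:
            if current:
                ents.append(current)
            current = None
        offset = end + 1  # skip the joining space
    if current:
        ents.append(current)
    return {"text": " ".join(tokens), "ents": ents, "title": None}


# Hypothetical sample data.
tokens = ["Yesterday", "John", "Smith", "arrived", "on", "5", "May", "2021"]
tags = ["O", "B-PER", "I-PER", "O", "O", "B-DATE", "I-DATE", "I-DATE"]
data = iob_to_displacy(tokens, tags)
print(data["ents"])
```

The resulting dictionary could then be passed to `displacy.render(data, style="ent", manual=True)`, which renders pre-computed entity spans without running a pipeline.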
3
votes
3 answers

Training epochs interpretation during spaCy NER training

I was training my NER model with transformers, and am not really sure why the training stopped at some point, or why it even went through so many batches. This is what my configuration file looks like (relevant part): [training] train_corpus =…
archity
  • 562
  • 3
  • 11
  • 22
3
votes
1 answer

Getting similar words no longer working in spacy

I have a Google Colab notebook from a while ago which uses spacy 2.2.4 and successfully gets the most similar words for a list of words: import spacy import spacy.cli spacy.cli.download("en_core_web_lg") import en_core_web_lg nlp =…
code_to_joy
  • 569
  • 1
  • 9
  • 27
3
votes
2 answers

How to evaluate trained spaCy version 3 model?

I want to evaluate my trained spaCy model with the built-in Scorer function with this code: def evaluate(ner_model, examples): scorer = Scorer() for input_, annot in examples: text = nlp.make_doc(input_) gold =…
Malte
  • 329
  • 3
  • 14
3
votes
3 answers

Given a word can we get all possible lemmas for it using Spacy?

The input word is standalone and not part of a sentence but I would like to get all of its possible lemmas as if the input word were in different sentences with all possible POS tags. I would also like to get the lookup version of the word's…
PSK
  • 347
  • 2
  • 13
3
votes
1 answer

Spacy - Use two trainable components with two different datasets

I was wondering if it is possible to train two trainable components in spaCy with two different datasets? In fact, I would like to use the NER and the text classifier, but since the training datasets for these two components should be annotated…
JulienD
  • 35
  • 5
3
votes
1 answer

Packaging SpaCy Model with Pyinstaller: E050 Can't find model

I'm using PyInstaller to package my Python spaCy code. I'm using de_core_news_sm, which I installed via pip. The normal script performs as expected, but as soon as it is packaged with PyInstaller it cannot find the model: [E050] Can't find model…
jre_joe
  • 61
  • 4
3
votes
1 answer

Migrate trained Spacy 2 pipelines to Spacy 3

I've been using spaCy 2.3.1 until now and have trained and saved a couple of pipelines for my custom Language class. But now, using spaCy 3.0 and spacy.load('model-path'), I'm facing problems such as config.cfg file not found and other kinds of…
Marzi Heidari
  • 2,660
  • 4
  • 25
  • 57
3
votes
1 answer

Memory leak with en_core_web_trf model, Spacy

There is a memory leak when using `pipe` with the en_core_web_trf model. I run the model on a GPU with 16GB RAM; here is a sample of the code. !python -m spacy download en_core_web_trf import en_core_web_trf nlp = en_core_web_trf.load() #it's just an…
moro clash
  • 199
  • 1
  • 2
  • 6