Good day everyone,

I'm using the latest spaCy release (3.2.0), which has a multiprocessing feature: https://spacy.io/usage/processing-pipelines

import time

import spacy

nlp = spacy.load("en_core_web_trf")

# docs is an iterable of input strings, defined elsewhere
start_time = time.time()
for doc in nlp.pipe(docs, n_process=16, disable=["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer"]):
    pass  # other actions
print("--- %s seconds --- " % (time.time() - start_time))

No matter what value I pass for n_process, the execution time is the same. Process Explorer shows no additional Python processes being spawned, and the code runs on only one core.
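As a sanity check independent of spaCy, a minimal standard-library sketch like the one below could confirm whether worker processes start at all in this environment. The helper name worker_pids and the task count are illustrative assumptions, not part of spaCy, and this does not reproduce what nlp.pipe does internally; it only reports which process IDs handled the submitted tasks.

```python
import multiprocessing as mp
import os


def worker_pids(n_process: int, n_tasks: int = 32) -> set:
    """Return the set of PIDs of the pool workers that handled the tasks.

    Hypothetical helper for illustration only; not a spaCy function.
    """
    with mp.Pool(processes=n_process) as pool:
        # os.getpid is picklable by reference, so each call runs in a worker.
        results = [pool.apply_async(os.getpid) for _ in range(n_tasks)]
        # Collect the results before the pool is torn down on exiting the block.
        return {r.get() for r in results}


if __name__ == "__main__":
    # The guard matters on Windows, where child processes re-import this module.
    pids = worker_pids(n_process=4)
    print("parent PID:", os.getpid())
    print("distinct worker PIDs:", len(pids))
```

If this reports worker PIDs that differ from the parent PID, process spawning itself works here, and the problem is more likely in how the pipeline is being called.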

Any thoughts on how to enable multiprocessing for tasks like this?

Anton
