Good day everyone,
I'm using the latest spaCy release (3.2.0), which supports multiprocessing in nlp.pipe (https://spacy.io/usage/processing-pipelines):
import time

import spacy

nlp = spacy.load("en_core_web_trf")

start_time = time.time()
for doc in nlp.pipe(docs, n_process=16, disable=["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer"]):
    pass  # other actions on each doc
print("--- %s seconds ---" % (time.time() - start_time))
No matter what value I put for n_process, the execution time stays the same. Looking at Process Explorer, no additional Python processes are spawned and the code runs on only one core.
Any thoughts on how to enable multiprocessing for such tasks?