I am running multiple processes with multiprocessing.Pool:
import spacy
import multiprocessing
import logging

# global variable
nlp_bert = spacy.load("en_trf_bertbaseuncased_lg")
logging.basicConfig(level=logging.DEBUG)


def job_pool(data, job_number, job_to_do, groupby=None, split_col=None, **kwargs):
    pool = multiprocessing.Pool(processes=job_number)
    jobs = pool.map(job_to_do, data)
    return jobs


def job(slice):
    logging.debug('this shows')
    w1 = nlp_bert('word')
    w2 = nlp_bert('other')
    logging.debug(w1.similarity(w2))
    logging.debug("this doesn't")


job_pool([1, 2, 3, 4], 4, job)
Inside the worker, the call to nlp_bert never returns and no error is raised: 'this shows' is logged, but neither the similarity value nor "this doesn't" ever appears. How can I find out what is going wrong? Logging is already set to debug level.
The same calls work outside of multiprocessing, i.e. when I put them in a plain script and run the following:
import spacy
nlp_bert = spacy.load("en_trf_bertbaseuncased_lg")
w1 = nlp_bert('word')
w2 = nlp_bert('other')
print(w1.similarity(w2))
0.8381155446247196
I'm using:
- Python 3.8.2
- spacy Version: 2.3.2
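
To at least turn the silent hang into something visible, a minimal sketch like the one below might help. It assumes the worker is hanging rather than crashing. The names `slow_job` and `run_with_timeout` are hypothetical, and `time.sleep` stands in for the nlp_bert call so that the example runs without spaCy. It replaces `pool.map` with `pool.map_async(...).get(timeout=...)`, which raises `multiprocessing.TimeoutError` instead of blocking forever, and it enables multiprocessing's own logger with `multiprocessing.log_to_stderr`.

```python
import logging
import multiprocessing
import time


def slow_job(seconds):
    # Hypothetical stand-in for the nlp_bert call: sleeping simulates a hang.
    time.sleep(seconds)
    return seconds


def run_with_timeout(func, data, processes, timeout):
    """Map func over data in a pool; return None if it does not finish in time."""
    # Leaving the with-block terminates the pool, including any stuck workers.
    with multiprocessing.Pool(processes=processes) as pool:
        async_result = pool.map_async(func, data)
        try:
            return async_result.get(timeout=timeout)
        except multiprocessing.TimeoutError:
            logging.error("workers did not finish within %s seconds", timeout)
            return None


if __name__ == "__main__":
    # Show multiprocessing's internal debug messages (worker start/exit, etc.).
    multiprocessing.log_to_stderr(logging.DEBUG)
    print(run_with_timeout(slow_job, [0, 0], 2, timeout=5))  # prints [0, 0]
    print(run_with_timeout(slow_job, [0, 5], 2, timeout=1))  # prints None
```

This only confirms that a worker is stuck and does not explain why. It is a starting point for narrowing the problem down, not a fix.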