Python multiprocessing can't use functions from other module

Question

Update: it's working after updating my Spyder to 5.0.5. Thanks everyone!

I am trying to speed up a loop using multiprocessing. The code below aims to generate 10000 random vectors.

My idea is to split the task across 5 processes and store the output in result. However, result is an empty list when I run the code.

But if I remove result = add_one(result) from the randomize_data function, the code runs perfectly. So the error must come from using functions from other modules (Testing.test) inside multiprocessing.

Here is the add_one function from Testing.test:

def add_one(x):
    return x+1

How can I use a function from another module inside a process? Thank you.

import multiprocessing
import numpy as np
import pandas as pd

def randomize_data(mean, cov, n_init, proc_num, return_dict):
    result = pd.DataFrame()
    for _ in range(n_init):
        temp = np.random.multivariate_normal(mean, cov)
        result = result.append(pd.Series(temp), ignore_index=True)
    
    result = add_one(result)
    return_dict[proc_num] = result

if __name__ == "__main__":

    from Testing.test import add_one

    mean = np.arange(0, 1, 0.1)
    cov = np.identity(len(mean))
    
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    jobs = []
    
    for i in range(5):
        p = multiprocessing.Process(target=randomize_data, args=(mean, cov, 2000, i, return_dict, ))
        jobs.append(p)
        p.start()
    
    for proc in jobs:
        proc.join()
    
    result = return_dict.values()

score 0 · Answer 1 · answered Sep 07 '21 at 06:27

The issue here is pretty obvious: you imported add_one inside the if __name__ == "__main__": block, not at the top level of the module. Because of this, the reference to this function only exists once that block has run. Move the import statement up to the other imports at the top of your file, and your code should work.

import multiprocessing
import numpy as np
import pandas as pd
from Testing.test import add_one
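
A possible explanation for why the position of the import matters, assuming the spawn start method (the default on Windows; the OP's platform is not stated): a spawned child re-runs the main file under the module name __mp_main__, so anything inside the if __name__ == "__main__": guard is skipped in the child. The sketch below only simulates that re-run with runpy and starts no processes. The script text, the file name guarded_import.py, and the use of math.sqrt in place of Testing.test.add_one are stand-ins made up for illustration.

```python
import os
import runpy
import tempfile
import textwrap

# Hypothetical stand-in for the OP's script: the import sits inside the guard.
# math.sqrt plays the role of Testing.test.add_one.
SCRIPT = textwrap.dedent('''
    def worker():
        return sqrt(4.0)  # sqrt is looked up in the module globals at call time

    if __name__ == "__main__":
        from math import sqrt  # only bound when the file runs as "__main__"
''')


def names_defined(run_name):
    """Execute SCRIPT under the given module name and return its global names."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "guarded_import.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        return set(runpy.run_path(path, run_name=run_name))


as_parent = names_defined("__main__")    # how the parent process runs the file
as_child = names_defined("__mp_main__")  # how a spawned child re-runs it

print("sqrt" in as_parent)  # True
print("sqrt" in as_child)   # False
```

With the fork start method the child inherits the parent's memory instead of re-running the file, so an import inside the guard would already be visible there. That may be why the original code works on some setups and not on others.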

answered Sep 07 '21 at 06:27

Dominik Lovetinsky

Hmm, now it takes forever to compile. My Spyder has not finished compiling the program since you answered. Is there a version issue here? I'm using Python 3.7.6 – Meinung Sep 07 '21 at 06:32
Well, this currently seems to make no sense at all. The Python version should not be relevant here. Is there an endless loop somewhere in your code? – Dominik Lovetinsky Sep 07 '21 at 06:39
No, I ran the exact same code I posted here. I noticed something funny though: if I open Spyder and run this code for the first time, compiling takes forever. But if I restart my Spyder console and run this code again, it runs just fine. Weird – Meinung Sep 07 '21 at 06:53
Try another IDE, maybe it's a Spyder issue. But for me, the code with my answer embedded also works and terminates. – Dominik Lovetinsky Sep 07 '21 at 10:17
No, your explanation is not correct. After the statement `from Testing.test import add_one` is executed in the OP's original code, `add_one` *should* become available for `randomize_data` to call. If you print out `dir(sys.modules['__main__'])` you will see both `randomize_data` and `add_one`. – Booboo Sep 07 '21 at 12:01
Update: it's working after updating my Spyder to 5.0.5. Thanks everyone! – Meinung Sep 25 '21 at 13:10