The Python docs say that `starmap` blocks until the result is ready. Does this mean we can safely update a variable in the main process with the results from the child processes, like this?

from multiprocessing import Pool, cpu_count

all_files = list(range(100))

def create_one_training_row(num):
    return num

def process():
    all_result = []
    with Pool(processes=cpu_count()) as pool:
        for item in pool.starmap(create_one_training_row, zip(all_files)):
            all_result.append(item)
    return all_result

if __name__ == '__main__':
    ans = process()
    print(ans)
    print(sum(ans))
Saurabh Verma
  • Almost certainly the answer is yes, since there is nothing wrong with your code, but what do you mean by "safely update"? What exactly are you concerned about? – ken Feb 22 '22 at 09:06
  • I'm concerned about the values in all_result being overwritten by different processes, e.g. what if two or more processes try to perform all_result.append(item) at the same time? – Saurabh Verma Feb 22 '22 at 09:44
  • It won't happen. `starmap` executes the function in the child processes and copies the return values back to the main process. So `item` is always a local variable in the main process, and `all_result.append` is only ever executed in the main process. The same holds for the async methods, so whether the call blocks is irrelevant. In fact, with multiprocessing it is not possible to overwrite a variable in another process, except through shared objects such as `multiprocessing.Value`. – ken Feb 22 '22 at 10:53
  • As an aside: Why don't you just call `pool.map(create_one_training_row, all_files)`, which is simpler and less expensive since your worker function, `create_one_training_row`, only takes a single argument? `starmap` is useful when your worker function takes multiple arguments. – Booboo Feb 22 '22 at 11:57
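
To make the two comments above concrete, here is a minimal sketch (not the asker's code) that checks where each step runs. The names `work` and `collect` are made up for illustration, and the pool size of 2 is arbitrary. The worker returns its own PID along with the value, and the parent records the PID of the process that performs each `append`. It uses `pool.map`, as suggested, because the worker takes a single argument.

```python
import os
from multiprocessing import Pool


def work(num):
    # Hypothetical worker: runs in a child process and returns the value
    # together with the PID of the process that computed it.
    return num, os.getpid()


def collect(values):
    results = []
    appender_pids = set()
    with Pool(processes=2) as pool:
        # map() blocks until every result has been sent back to the parent.
        # The loop body below therefore runs only in the calling process.
        for num, worker_pid in pool.map(work, values):
            results.append((num, worker_pid))
            appender_pids.add(os.getpid())
    return results, appender_pids


if __name__ == '__main__':
    results, appender_pids = collect(range(10))
    # Values come back in input order.
    print([num for num, _ in results])
    # Every append happened in the main process.
    print(appender_pids == {os.getpid()})
    # Every value was computed in some other process.
    print(all(pid != os.getpid() for _, pid in results))
```

Both checks should print True: the workers only compute and return values, while the list is touched by a single process, so there is nothing for the children to race on.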
