
I am new to Python and to multiprocessing/multithreading. Here is my question: I tried to use the multiprocessing module in Python. I followed the guide, created two separate processes, placed a function in each, ran them, and recorded the time. I found that the elapsed time did not decrease, and I was wondering why. Here is my code:

import multiprocessing
import time
start = time.time()

def mathwork():
    print(sum(j * j for j in range(10 ** 7)))

if __name__ == '__main__':
    process1 = multiprocessing.Process(name='process1', target=mathwork)
    process2 = multiprocessing.Process(name='process2', target=mathwork)
    process1.start()
    process2.start()

end = time.time()
print(end - start)
Zhou XF
  • You have 2 issues. The first is that you aren't waiting for your processes to end, so it is very possible that your main program finishes before its children do. The second is the logic: you aren't splitting your task between the 2 processes, you are just telling 2 processes to do the same task twice, so your execution time can only go up, not down. – Nullman Aug 01 '19 at 08:48
  • Thanks for your comment. The second issue I could fix by defining a different function for each process, but I don't understand the first issue. How can I modify my code? – Zhou XF Aug 01 '19 at 08:53
  • In the future, please post your *actual* code. What you have shown has several SyntaxErrors for uppercase keywords (e.g. it should say ``import`` instead of ``Import``). – MisterMiyagi Aug 01 '19 at 09:00
  • Sorry, it's just that typing on an iPad is not as easy as typing on a keyboard. – Zhou XF Aug 01 '19 at 09:09
  • @ZhouXF Presumably, you ran your code somewhere. Why not copy it from there instead of retyping it? – phihag Aug 01 '19 at 09:16
  • Because the computer I use to run the code does not have an internet connection. By the way, thank you for your answer @phihag. – Zhou XF Aug 01 '19 at 09:24

2 Answers


I'm going to assume that the code you posted was messed with by some text editor.
I'll answer your question using the example below:

import multiprocessing
import time
start = time.time()

def mathwork():
    print(sum(j * j for j in range(10 ** 7)))

if __name__ == '__main__':
    process1 = multiprocessing.Process(name='process1', target=mathwork)
    process2 = multiprocessing.Process(name='process2', target=mathwork)
    process1.start()
    process2.start()
    end = time.time()
    print(end-start)

The reason your code takes just as long to complete, no matter what the processes are doing, is that you aren't waiting for your processes to complete before printing out the time.

To wait for your processes to finish, you have to call join on them, which gives the following snippet:

if __name__ == '__main__':
    process1 = multiprocessing.Process(name='process1', target=mathwork)
    process2 = multiprocessing.Process(name='process2', target=mathwork)
    process1.start()
    process2.start()
    process1.join()
    process2.join()
    end = time.time()
    print(end-start)

You'll notice that the measured time is now larger, because your code now waits for the processes to finish and return before taking the end time.

As an interesting aside (now found out to be due to this quirk between Windows and Unix: Windows has to spawn a fresh interpreter and re-import the module for each child, while Unix can simply fork), if your print statement were outside the __name__ == '__main__' check, you would print a time for each process you ran, because each child loads the file again to get the function definition.

With this method I get:

4.772554874420166 # single execution (both function calls in main)
2.486908197402954 # multiprocessing (one process for each call)
Serdalis
  • Can you cite a source for your _interesting aside_? On my system, Python simply forks; there's no need to parse any source again. Also, how would that work if the target were a lambda or used a closure? – phihag Aug 01 '19 at 09:16
  • @phihag In all honesty, when I put the print outside of the main check I get 3 prints, 2 being 0.0. When I put it inside the main check, I get 1 print. Though I may be mis-diagnosing; it might be a quirk of my system. – Serdalis Aug 01 '19 at 09:21
  • In fact, when I put the print statement outside the __main__ check, I get the same result as you do @Serdalis. I am running this code in PyCharm. – Zhou XF Aug 01 '19 at 09:28
  • @phihag For more information I added the `__name__` to the print and got: `0.0 __mp_main__` `0.0 __mp_main__` `0.10295414924621582 __main__` [This question](https://stackoverflow.com/questions/43545179/why-does-importing-module-in-main-not-allow-multiprocessig-to-use-module) appears to answer our concerns. I am using windows, thus a new python process has to be spooled up to multithread... That's actually quite cool. – Serdalis Aug 01 '19 at 09:29
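
The behaviour discussed in these comments is easy to reproduce. Below is a minimal sketch (the reduced range and the explicit set_start_method call are additions for illustration, not part of the answer) that forces the spawn start method, so the re-import happens on every platform: the module-level print runs once in the parent as __main__ and once per child as __mp_main__, just like the output quoted above.

import multiprocessing

# Runs once in the parent (__main__) and once per child (__mp_main__),
# because under the "spawn" start method each child re-imports the module
# to find the target function.
print('module loaded as', __name__)

def mathwork():
    sum(j * j for j in range(10 ** 5))

if __name__ == '__main__':
    # Force "spawn" so the behaviour matches Windows on every platform.
    multiprocessing.set_start_method('spawn')
    workers = [multiprocessing.Process(target=mathwork) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()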

You measure the time it takes to start the processes, not the time it takes to run them. Wait for the processes to finish by calling join, like this:

import multiprocessing
import time

def mathwork():
    sum(j * j for j in range(10 ** 7))

if __name__ == '__main__':
    start = time.time()
    process1 = multiprocessing.Process(name='process1', target=mathwork)
    process2 = multiprocessing.Process(name='process2', target=mathwork)
    process1.start()
    process2.start()
    process1.join()
    process2.join()
    print('multiprocessing: %s' % (time.time() - start))

    start = time.time()
    mathwork()
    mathwork()
    print('one process: %s' % (time.time() - start))

On my system, the output is:

multiprocessing: 0.9190812110900879
one process: 1.8888437747955322

Showing that multiprocessing does indeed make this computation roughly twice as fast.

phihag
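
As a closing note on the point raised in the comments under the question: both answers gain their speedup by running the same function twice, once per process. If the goal is to make a single computation finish sooner, the work itself has to be split between the processes. Here is a minimal sketch of that idea using multiprocessing.Pool; the two-chunk split and the helper name partial_sum are illustrative choices, not taken from either answer.

import multiprocessing
import time

def partial_sum(bounds):
    # Sum of j * j over one slice of the range; bounds is a (lo, hi) tuple.
    lo, hi = bounds
    return sum(j * j for j in range(lo, hi))

if __name__ == '__main__':
    n = 10 ** 7
    chunks = [(0, n // 2), (n // 2, n)]

    start = time.time()
    with multiprocessing.Pool(processes=2) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print('split across 2 processes: %s' % (time.time() - start))

    start = time.time()
    total_single = sum(j * j for j in range(n))
    print('single process: %s' % (time.time() - start))

    # Both approaches compute the same value.
    assert total == total_single

On a machine with at least two idle cores, the split version should take roughly half as long as the single-process version, minus the overhead of starting the worker processes.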