
I have recently been learning multiprocessing in Python. As I understand it, the multiprocessing module is used to achieve parallelism, meaning that the processes execute at the same time.

But why does the code below show a difference of a few milliseconds between the start times of the four processes?

from multiprocessing import Process
import os, time, datetime, random, tracemalloc

tracemalloc.start()
children = 4    # number of child processes to spawn
maxdelay = 6    # maximum delay in seconds

def status():
    return ('Time: ' + 
        str(datetime.datetime.now().time()) +
        '\t Malloc, Peak: ' +
        str(tracemalloc.get_traced_memory()))

def child(num):
    delay = random.randrange(maxdelay)
    print(f"{status()}\t\tProcess {num}, PID: {os.getpid()}, Delay: {delay} seconds...")
    time.sleep(delay)
    print(f"{status()}\t\tProcess {num}: Done.")

if __name__ == '__main__':
    print(f"Parent PID: {os.getpid()}")
    for i in range(children):
        proc = Process(target=child, args=(i,))
        proc.start()

Below is the output:

Parent PID: 16048
Time: 09:52:47.014906    Malloc, Peak: (228400, 240036)     Process 0, PID: 16051, Delay: 1 seconds...
Time: 09:52:47.016517    Malloc, Peak: (231240, 240036)     Process 1, PID: 16052, Delay: 4 seconds...
Time: 09:52:47.018786    Malloc, Peak: (231616, 240036)     Process 2, PID: 16053, Delay: 3 seconds...
Time: 09:52:47.019398    Malloc, Peak: (232264, 240036)     Process 3, PID: 16054, Delay: 2 seconds...

Time: 09:52:48.017104    Malloc, Peak: (228434, 240036)     Process 0: Done.
Time: 09:52:49.021636    Malloc, Peak: (232298, 240036)     Process 3: Done.
Time: 09:52:50.022087    Malloc, Peak: (231650, 240036)     Process 2: Done.
Time: 09:52:51.020856    Malloc, Peak: (231274, 240036)     Process 1: Done.

Why do the start times of the processes differ? Doesn't this contradict the definition of parallelism?

  • They *run* in parallel. That doesn't mean they *start* in parallel, since they are all being started by the same process. – chepner Jan 04 '21 at 13:41
  • @chepner, thanks, does that mean if I start another parent process, then they will be started at the same time? – Parry Wang Jan 04 '21 at 14:24
  • You may notice that "just"-[CONCURRENT] process scheduling is the type of system behaviour here. If in doubt, launch 125 processes in the very same manner and the results will show that a True-[PARALLEL] process flow was not achieved: the resources simply do not permit such breadth of parallel co-execution, so the system resorts to "just"-[CONCURRENT] scheduling (a best effort, taking into account the user's effective rights and valid O/S priorities), which is by definition ***not*** a True-[PARALLEL] orchestration. – user3666197 Jan 04 '21 at 22:43
  • I don't see why it matters if the start times are staggered. If you need them to run *precisely* together, then you need to implement some sort of synchronization protocol between them. Parallelism implies some degree of independence. – chepner Jan 05 '21 at 12:46

1 Answer

To add to what @chepner said, you're launching the processes one at a time in a for loop, so they'll be launched sequentially. That's not the same as saying that they're not running in parallel.

In fact, the output shows that they are being executed in parallel: the processes don't end in the order in which they were launched. If they were being executed sequentially, each one would finish before the next started, so they would end in launch order and the whole run would take the sum of the delays (1 + 4 + 3 + 2 = 10 seconds) rather than the roughly 4 seconds seen in your output.

Instead, the processes ended in order of their delays. Process 0 has a delay of 1 second, so it ended first. Process 3 has a delay of 2 seconds and ended second. Process 2 has a delay of 3 seconds and ended third. Finally, process 1 has a delay of 4 seconds and ended fourth. This indicates that they are, in fact, running in parallel.
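
One way to check this yourself is to time the whole run. The sketch below is only an illustration: it fixes the delays to the values from the output above instead of drawing them at random, and `DELAYS` and `run_all` are names made up for this example. If the children overlap, the elapsed time should be close to the longest delay rather than the sum.

```python
from multiprocessing import Process
import time

DELAYS = [1, 4, 3, 2]  # illustrative: the delays seen in the question's output


def child(delay):
    # Stand-in for real work: just sleep for the given number of seconds.
    time.sleep(delay)


def run_all(delays):
    """Start one process per delay, wait for all of them, return elapsed seconds."""
    start = time.perf_counter()
    procs = [Process(target=child, args=(d,)) for d in delays]
    for p in procs:
        p.start()   # launched one after another, as in the question
    for p in procs:
        p.join()    # wait until every child has finished
    return time.perf_counter() - start


if __name__ == '__main__':
    elapsed = run_all(DELAYS)
    print(f"Elapsed: {elapsed:.2f} s "
          f"(longest delay: {max(DELAYS)} s, sum of delays: {sum(DELAYS)} s)")
```

Note that sleeping processes need almost no CPU, so this only demonstrates that the children overlap in time; for CPU-bound work, how much really runs simultaneously depends on the number of cores available.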

  • Thanks for your detailed explanation. Is there any way to make them start at the same time? Should I use another mechanism instead of a for loop? – Parry Wang Jan 04 '21 at 14:01
  • @ParryWang Since the Python program starts on a single thread initially, you'll have to create processes sequentially at *some* point. (You *could* create threads/processes that spawn other threads/processes, but you'd have to create those threads sequentially instead). – EJoshuaS - Stand with Ukraine Jan 04 '21 at 14:11
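
The processes still have to be created one after another, but if the goal is for the *work* to begin together, the synchronization mentioned in the comments can be sketched with `multiprocessing.Barrier`. This is a minimal illustration, not a guarantee of simultaneous execution: `synchronized_start` and the `results` queue are names invented for this example, and how closely the release times line up still depends on the OS scheduler and on having enough free cores.

```python
from multiprocessing import Barrier, Process, Queue
import time


def child(num, barrier, results):
    barrier.wait()                    # block until every child has reached this point
    results.put((num, time.time()))   # wall-clock time at which this child was released


def synchronized_start(children=4):
    """Start `children` processes that wait for each other before doing any work."""
    barrier = Barrier(children)       # releases only once `children` processes are waiting
    results = Queue()
    procs = [Process(target=child, args=(i, barrier, results))
             for i in range(children)]
    for p in procs:
        p.start()                     # creation itself is still sequential
    stamps = dict(results.get() for _ in procs)  # drain the queue before joining
    for p in procs:
        p.join()
    return stamps


if __name__ == '__main__':
    stamps = synchronized_start()
    first = min(stamps.values())
    for num in sorted(stamps):
        print(f"Process {num} released {1e3 * (stamps[num] - first):.3f} ms after the first")
```

The barrier removes the process-creation cost from the gap between start times, since every child is already running when it is released. Whatever gap remains comes from how quickly the scheduler wakes each process.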