Is it true that in multiprocessing, each process gets its own GIL in CPython? How different is that from creating new runtimes?

Question

Are there any caveats to it? I have a few questions related to it.

How costly is it to create more GILs? Is it any different from creating a separate Python runtime? Once a new GIL is created, does that process build everything (objects, variables, stack, heap) from scratch as required, or does it get a copy of everything currently on the heap and the stack? (Garbage collection would malfunction if the processes were working on the same objects.) Are the pieces of code being executed also copied to the new CPU cores? Also, can I relate one GIL to one CPU core?

Copying things is a fairly CPU-intensive task (correct me if I am wrong), so what would be the threshold for deciding whether to go for multiprocessing?

PS: I am talking about CPython but please feel free to extend the answer to whatever you feel is necessary.

sprksh · Accepted Answer · 2020-08-16T03:45:42.093

Looking back at this question after 6 months, I feel I can clarify the doubts of my younger self. I hope this will be helpful to people who stumble upon it.

Yes, it is true that with the multiprocessing module each process has a separate GIL, and there are no caveats to that. But the question's understanding of the runtime and the GIL is flawed and needs correcting.

I will clear up the doubts and answer the questions with a series of statements.

Python code is run by the CPython virtual machine: it is compiled to CPython bytecode, and that bytecode is then interpreted. This is what constitutes the Python runtime.
When we create a new process, an entire new Python virtual machine is launched (which we call the Python process), with its own stack and heap memory.
Yes, this is costly, but not too costly, because the Python virtual machine is a piece of C code precompiled to machine code. To put it in perspective, Java programs tend to avoid multiprocessing because every process would need its own JVM, which would be terrible, as a JVM needs a lot of memory and typically takes longer to start than CPython.
The GIL is just a lock (a mutex) within the Python virtual machine that lets only one thread execute CPython bytecode at a time. So the questions about GIL creation and cost are misplaced; the real intention was to ask about the CPython virtual machine.
Can I relate 1 GIL to 1 CPU core? It is better to ask whether 1 Python process can be related to 1 CPU core, and the answer is no. It is the kernel's job to decide which core the process runs on (this keeps changing from time to time, and by default the process has no say in it). The only guarantee is that, at any given point in time, only one thread of a Python process executes CPython bytecode (due to the GIL), so pure-Python code in one process does not run on several cores at once.
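
The statements above can be illustrated with a minimal sketch: a child process started through multiprocessing has its own PID and its own copy of module-level state, so changing a global in the child leaves the parent untouched. The names `counter`, `bump_and_report` and `run_demo` are made up for this illustration, and the explicit "fork" start method is an assumption that only holds on POSIX systems.

```python
import multiprocessing as mp
import os

counter = 0  # module-level state that lives in the parent's heap


def bump_and_report(conn):
    """Runs in the child: change the global and report what the child sees."""
    global counter
    counter += 100
    conn.send((os.getpid(), counter))
    conn.close()


def run_demo():
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=bump_and_report, args=(child_conn,))
    proc.start()
    child_pid, child_counter = parent_conn.recv()
    proc.join()
    # `counter` here is the parent's copy, which the child never touched.
    return os.getpid(), child_pid, counter, child_counter


if __name__ == "__main__":
    parent_pid, child_pid, parent_counter, child_counter = run_demo()
    print("parent pid:", parent_pid, "counter:", parent_counter)
    print("child pid: ", child_pid, "counter:", child_counter)
```

The two processes report different PIDs and different values of `counter`, which is the practical meaning of "each process has its own interpreter, heap and GIL".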

What gets copied to the cores and how the OS tries to keep a process on the core it is working on is a separate and very deep topic in itself.

The final question is a subjective one, but with all this understanding it is basically a cost-to-benefit ratio that varies from program to program and depends on how CPU-intensive the work is, how many cores the machine has, and so on. So it cannot be generalised.

Ondrej K. · Answer 2 · 2020-08-16T13:31:14.143

The short answer to the title question is: yes. Each process has its own Global Interpreter Lock. After that it gets complicated, and it is less a Python matter than a question for your underlying OS.

On Linux, it should be cheaper to create new processes through multiprocessing (when it uses the fork start method) than to start a new Python interpreter from scratch:

You fork() the parent process (side note: clone() is actually used these days). The child already has your code and starts with a copy of the parent's address space. Since you are spawning another instance of your running process, there is no need to execve() (with all the overhead associated with that) and repopulate its contents.
When we say a copy of the address space, it does not actually all get copied; copy-on-write is used instead, so pages that nobody modifies never need to be copied at all.
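
As a rough sketch of the two points above, the following uses `os.fork()` directly (POSIX only). The child can read data that the parent built before the fork without anything being reloaded, and a write in the child stays private to the child. `lookup_table` and `read_in_forked_child` are hypothetical names chosen for the example.

```python
import os

# Built once in the parent, before any fork happens.
lookup_table = {n: n * n for n in range(1000)}


def read_in_forked_child(key):
    """Fork, let the child read and then modify inherited data, return all views."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: lookup_table is already in memory, no execve() or re-import.
        try:
            os.close(read_fd)
            inherited = lookup_table[key]
            lookup_table[key] = -1  # the write affects the child's pages only
            os.write(write_fd, f"{inherited},{lookup_table[key]}".encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code path
    # Parent: collect the child's report and compare it with our own view.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        payload = reader.read()
    os.waitpid(pid, 0)
    inherited, modified = (int(part) for part in payload.split(","))
    return inherited, modified, lookup_table[key]
```

Calling `read_in_forked_child(12)` returns the value the child inherited, the value after the child's write, and the parent's value, which is unchanged. Note that in CPython even reading an object updates its reference count, so some pages get copied in practice even when the child only reads.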

For that reason, my gut feeling is that multiprocessing would almost always be more efficient than starting a completely new Python interpreter from scratch. After all, even if you started a new interpreter (presumably from the running process), it would first perform the fork()/clone(), including the "copy" of the parent's address space, before moving on to execve().

But really, this may vary: it depends on how your underlying OS handles the creation of new processes, as well as on its memory management.
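
One way to check this gut feeling on a given machine is to time both paths. The sketch below is only a rough measurement under stated assumptions: it is POSIX only, the function names `time_fork` and `time_fresh_interpreter` are invented for the example, and the child does no real work, so the numbers show start-up cost and nothing else.

```python
import os
import subprocess
import sys
import time


def time_fork(repeats=20):
    """Average seconds to fork a child that exits immediately."""
    start = time.perf_counter()
    for _ in range(repeats):
        pid = os.fork()  # POSIX only
        if pid == 0:
            os._exit(0)  # child: leave straight away
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / repeats


def time_fresh_interpreter(repeats=20):
    """Average seconds to start a brand-new Python interpreter that does nothing."""
    start = time.perf_counter()
    for _ in range(repeats):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(f"fork only:         {time_fork():.6f} s per process")
    print(f"fresh interpreter: {time_fresh_interpreter():.6f} s per process")
```

The results depend on the OS, the Python build and how much memory the parent has mapped, so they are best read as a comparison on one machine and not as general figures.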
