7

I have heard of Python's GIL (global interpreter lock) problem: only one Python thread can execute Python bytecode at a time, even on a multicore machine. So a multithreaded Python program is not a good idea for CPU-bound work.

I am wondering if I can write a C extension that uses pthreads to boost the performance of my program. I'd like to create a thread in my C extension and make it run in parallel with Python's main thread.

My assumption is that the Python main thread will mostly do IO, while the pthread in my C extension will spend most of its time doing computation. The Python main thread communicates with the thread in the C extension through a queue (a producer-consumer model).

Is there any difference between multithreading in Python and multithreading in a C extension?

snakecharmerb
ming.kernel
  • I’ve absolutely no knowledge of multithreaded programming, but does [Python 2.6’s `multiprocessing module`](http://docs.python.org/library/multiprocessing.html) make any difference to your thoughts? – Paul D. Waite Sep 28 '12 at 07:48
  • @PaulD.Waite I just played with the multiprocessing module, and I think it can avoid the GIL problem. I will give it a try and see how it goes. – ming.kernel Sep 28 '12 at 08:07
  • In [this keynote speech at PyCon 2012](http://youtu.be/EBRMq2Ioxsc) [Guido van Rossum](http://www.python.org/~guido/) states that the "GIL problem" is no longer as major an issue as it once was. Before thinking about complex solutions like threaded C extensions, which will hurt the readability and maintainability of your code, verify that the GIL is actually a bottleneck in your case. I suspect that something like the standard library multiprocessing module will be sufficient, as suggested by others here. – Chris Sep 28 '12 at 08:26

3 Answers

2

To answer your original question:

Yes, C extensions can be immune from the GIL, provided they do not call any Python API functions without the GIL held. So, if you need to communicate with the Python app, you'd need to acquire the GIL to do so. If you don't want to get your hands too dirty with the C API, you can use ctypes to call a C library (which can just use pthreads as usual), or Cython to write your C extension in a Python-like syntax.
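As a rough illustration of the `ctypes` route, the sketch below calls the C library's `usleep()` from several Python threads. It assumes a POSIX system where `ctypes.util.find_library("c")` resolves the C library; `run_in_threads` is a name made up for this example, and `usleep()` only stands in for real C work. Because `ctypes.CDLL` releases the GIL around each foreign call, the calls should overlap rather than run one after another.

```python
import ctypes
import ctypes.util
import threading
import time

# Assumption: a POSIX system whose C library exports usleep().
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.usleep.argtypes = [ctypes.c_uint]
libc.usleep.restype = ctypes.c_int


def run_in_threads(n_threads, microseconds):
    """Call usleep() from n_threads Python threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=libc.usleep, args=(microseconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # If the GIL were held during each call, this would take about 0.8 s.
    elapsed = run_in_threads(4, 200_000)
    print(f"4 threads x 0.2 s each took {elapsed:.2f} s")
```

The same pattern should apply to a compute-heavy function in your own shared library, as long as that function never touches the Python API.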

nneonneo
  • "provided they do not call any Python API functions without the GIL held": do you mean that as long as a C extension does not call any Python API that needs the GIL, it can run without the GIL? – ming.kernel Sep 28 '12 at 08:25
  • Calling any of the Python API functions requires that you hold the GIL (it's a lock), and that forces Python programs to run only one thread at a time. If you release the lock, you cannot call Python API functions, but you are free to run as many simultaneous threads as you want. – nneonneo Sep 28 '12 at 08:28
  • Thanks for the clarification. I guess you mean that ctypes already handles acquiring and releasing the GIL internally. – ming.kernel Sep 28 '12 at 08:34
  • `ctypes` can call a C function, and the C function will run without the GIL. If the C function calls a `ctypes`-provided callback (a Python function), the GIL will be reacquired before the function is called. Thus, you can actually spawn off a thread, call to `ctypes` with a Python function as a callback, and be notified through that callback when things finish. – nneonneo Sep 28 '12 at 08:39
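The callback mechanism described in the last comment can be sketched with the C library's `qsort()`, which calls back into a Python comparison function. This assumes a platform where `ctypes.util.find_library("c")` resolves the C library; `py_compare`, `sort_with_callback`, and `calls` are names invented for the example. Here `qsort()` runs synchronously on the calling thread, so only the C-to-Python callback path is shown, not a separately spawned C thread.

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C signature of the comparator: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

calls = []  # records every pair the C code asked Python to compare


def py_compare(a, b):
    # ctypes reacquires the GIL before this Python code runs.
    calls.append((a[0], b[0]))
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_with_callback(values):
    arr = (ctypes.c_int * len(values))(*values)
    callback = CMPFUNC(py_compare)  # keep a reference while C may call it
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), callback)
    return list(arr)


if __name__ == "__main__":
    print(sort_with_callback([5, 1, 7, 33, 99]))
    print(f"Python callback invoked {len(calls)} times")
```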
1

The Python interpreter is not aware of C-launched threads in any way, so they can happily churn through their own CPU time.

However, I doubt this is the right solution for your performance problems. First try using multiple processes with the multiprocessing module. If interprocess IO is too expensive after that, you can resort to trickery like C threads. This would make your program an order of magnitude more complex, so avoid it if possible.
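A minimal sketch of the multiprocessing-first approach, assuming the computation can be split into independent chunks. The `count_primes` function and the chunk boundaries are made up for illustration; any CPU-bound function that takes and returns picklable values should fit the same shape.

```python
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Each chunk runs in its own process with its own interpreter and GIL.
    chunks = [(0, 5000), (5000, 10000), (10000, 15000), (15000, 20000)]
    with Pool(processes=4) as pool:
        per_chunk = pool.map(count_primes, chunks)
    print(sum(per_chunk))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.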

Mikko Ohtamaa
0

Provided one thread runs CPU bound while the other runs IO bound, I don't see a problem.

The IO bound thread will call IO routines which usually release the GIL while doing their stuff, effectively allowing the other thread to run.

So give the "simple" solution a try and switch only if it really doesn't work the way you want it to.
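The "simple" solution might look like the sketch below: plain Python threads with a `queue.Queue` between an IO-bound producer and a CPU-bound consumer. `worker`, `run_pipeline`, and `io_delay` are hypothetical names, and `time.sleep()` is only a stand-in for blocking IO, which releases the GIL while it waits.

```python
import queue
import threading
import time


def worker(tasks, results):
    """Consumer: pull numbers off the queue and do the CPU-bound part."""
    while True:
        n = tasks.get()
        if n is None:  # sentinel: no more work
            break
        results.append(sum(i * i for i in range(n)))


def run_pipeline(inputs, io_delay=0.01):
    """Producer: simulate blocking IO, then hand each item to the worker."""
    tasks = queue.Queue()
    results = []
    t = threading.Thread(target=worker, args=(tasks, results))
    t.start()
    for n in inputs:
        time.sleep(io_delay)  # stand-in for IO; the GIL is released here
        tasks.put(n)
    tasks.put(None)
    t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([100_000, 200_000, 300_000]))
```

If profiling shows the worker starving the IO thread, that is the point at which the process-based or C-based alternatives become worth their extra complexity.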

glglgl