
I'm just getting started with Python gevent and I was wondering about the CPU / multicore usage of the library.

Trying some examples that do many requests via the monkey-patched urllib, I noticed that they were running on just one core, using 99% load.

How can I use all cores with gevent in Python? Is there a best practice? Are there any side effects of using multiple processes together with gevent?

BR dan

thesonix

1 Answer


Gevent gives you the ability to deal with blocking requests. It does not give you the ability to run on multi-core.

There's only one greenlet (gevent's coroutine) running in a Python process at any time. The real benefit of gevent is that it is very powerful when dealing with I/O bottlenecks, which is usually the case for general web apps, API endpoints, web-based chat apps or backends, and networked apps in general. When an app does CPU-heavy computation, there will be no performance gain from using gevent. When an app is I/O bound, gevent is pure magic.
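A quick sketch of where that gain comes from (assuming gevent is installed; the network call is simulated with gevent.sleep so the example is self-contained):

```python
import time

import gevent


def fake_request(i):
    # Simulate a network call that blocks on I/O for 0.1 s.
    # With a real monkey-patched urllib call, the effect is the same:
    # the greenlet yields to the hub while waiting on the socket.
    gevent.sleep(0.1)
    return i


start = time.time()
jobs = [gevent.spawn(fake_request, i) for i in range(10)]
gevent.joinall(jobs)
elapsed = time.time() - start

# Ten 0.1 s "requests" complete in roughly 0.1 s total, not 1 s,
# because all greenlets wait concurrently on the same event loop.
print([job.value for job in jobs], round(elapsed, 2))
```

All of this still happens on a single core; the speedup comes purely from overlapping the waits, not from parallel computation.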

There is one simple rule: greenlets are switched away whenever an I/O operation would block, or when you switch explicitly (e.g. with gevent.sleep()).
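The explicit-switch half of that rule can be seen directly (assuming gevent is installed; gevent.sleep(0) is the idiomatic "yield now" call):

```python
import gevent

log = []


def worker(name, n):
    for i in range(n):
        log.append((name, i))
        gevent.sleep(0)  # explicit yield: hand control back to the hub


gevent.joinall([gevent.spawn(worker, "a", 3),
                gevent.spawn(worker, "b", 3)])

# The two greenlets interleave only because each one yields explicitly.
print(log)
```

Remove the gevent.sleep(0) and each worker runs to completion before the other starts, because nothing in the loop blocks on I/O.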

The built-in Python threads actually behave in the same (pseudo-)concurrent way as gevent's greenlets.

The key difference is this: greenlets use cooperative multitasking, whereas threads use preemptive multitasking. This means a greenlet will never stop executing and "yield" to another greenlet unless it calls certain "yielding" functions (like gevent.socket.socket.recv or gevent.sleep).

Threads, on the other hand, will yield to other threads (sometimes unpredictably) based on when the operating system decides to swap them out.
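A small sketch of the cooperative side of that contrast (assuming gevent is installed; function names are illustrative):

```python
import gevent

order = []


def hog():
    # A pure CPU loop with no yield points: under cooperative
    # multitasking this greenlet runs to completion uninterrupted.
    total = 0
    for i in range(100_000):
        total += i
    order.append("hog done")


def polite():
    order.append("polite ran")


gevent.joinall([gevent.spawn(hog), gevent.spawn(polite)])
print(order)
```

An OS thread scheduler could preempt hog() mid-loop; the gevent hub never will, which is exactly why a CPU-bound greenlet starves the rest of the process.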

And finally, to utilize multiple cores in Python - if that's what you want - we have to depend on the multiprocessing module (which is built into the standard library). This gets around the GIL. Other alternatives include using Jython, or executing tasks in parallel on different CPUs via a task queue, e.g. ZeroMQ.

I wrote a very long explanation here, if you care to dive into the details: http://learn-gevent-socketio.readthedocs.org/en/latest/ :-D

Calvin Cheng
    I might add that depending on the case, running several Python processes with gevent could be a good solution. Of course this is not an option in case the processes need to communicate with each other in a significant amount. – ferrix Mar 25 '13 at 14:51
  • Thanks for the answer. What I want is to do as many I/O operations as possible on the machine. The question is: am I capable of doing more requests using n processes (not threads) where n=cpu_cores or is gevent with one process as fast as it can get? – thesonix Mar 25 '13 at 16:07
  • 1
    Here's how trunk.ly (in the words of Alex Dong) dealt with an I/O-bound AND CPU-bound problem (they crawl sites and then place the crawled content in a search index) - https://groups.google.com/d/msg/gevent/4hR1P6Vd-uk/4A4bw5ynuucJ – Calvin Cheng Mar 26 '13 at 03:00
  • 1
    Also @thesonix - check out http://gehrcke.de/gipc/ - it may be what you are looking for "The usage of multiple processes in the context of gevent in principal can be a decent solution whenever a generally I/O-limited Python application needs to distribute tasks among multiple CPUs in parallel. However, naive usage of Python’s multiprocessing package within a gevent-powered application may raise various problems and most likely breaks the application in many ways. gipc is developed with the motivation to solve these issues transparently ..." – Calvin Cheng Mar 26 '13 at 03:07