
How are greenlets implemented? Python uses the C stack for the interpreter and heap-allocates Python stack frames, but beyond that, how does greenlet allocate/swap stacks, how does it hook into the interpreter and function call mechanisms, and how does this interact with C extensions? (Any quirks?)

There are some comments at the top of greenlet.c in the source, but they're a bit opaque. FWIW I'm coming from the perspective of someone who is unfamiliar with CPython internals but is very familiar with low-level systems programming, C, threads, events, coroutines/cooperative threads, kernel programming, etc.

(Some data points: they don't use ucontext.h and they do 2x memcpy, alloc, and free on every context switch.)

Yang

2 Answers


When a python program runs, you have essentially two pieces of code running under the hood.

First, there is the C code of the CPython interpreter, which uses the standard C-stack to save its internal stack-frames. Second, there is the interpreted Python bytecode, which does not use the C-stack but instead uses the heap to save its stack-frames. A greenlet is just standard Python code and thus behaves identically.
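
The point that Python frames are ordinary heap objects can be checked from Python itself. This is a minimal sketch that relies on `sys._getframe`, a CPython implementation detail; the function name `leak_frame` is made up for illustration.

```python
import sys

def leak_frame():
    # A local variable that lives in this call's Python frame.
    x = 42
    # Hand the frame object itself back to the caller.
    return sys._getframe()

frame = leak_frame()

# The call has returned, so its C-stack usage is gone, yet the Python
# frame object is still alive and still holds the local variable.
frame_name = frame.f_code.co_name
frame_local_x = frame.f_locals["x"]
print(frame_name, frame_local_x)
```

Because the frame survives the return of the function that created it, it cannot be living on the C-stack. That leaves only the interpreter's own C frames for greenlet to deal with.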

Now, in a typical microthreaded application you'd have thousands, if not millions, of microthreads (greenlets) switching all over the place. Each switch is essentially equivalent to a function call with a deferred return (so to speak) and thus uses a bit of C-stack. The problem is that the interpreter's C-stack will sooner or later overflow. This is exactly the problem the greenlet extension addresses: it moves pieces of the stack back and forth between the C-stack and the heap.

As you know, there are three fundamental events with greenlets, a spawn, a switch, and a return, so let's look at those in turn:

A) A Spawn

The newly spawned greenlet is associated with its own base address in the stack (where we currently are). Apart from that, nothing special happens. The python code of the newly spawned greenlet uses the heap in a normal way and the interpreter continues using the C-stack as usual.

B) A Switch

When one greenlet switches to another, the relevant part of the C-stack (starting from the base address of the switching greenlet) is copied to the heap. That C-stack area is then free for reuse, and the target greenlet's previously saved interpreter stack data is copied from the heap back into it. The Python code of the target greenlet continues using the heap in the normal way. Of course, the extension code keeps track of all of this (which heap section belongs to which greenlet, and so on).

C) A Return

The stack is untouched and the heap area of the returning greenlet is freed by the python garbage collector.
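
The three events above can be mimicked with a toy model. This is only a sketch under stated assumptions: a `bytearray` stands in for the C-stack, a dict stands in for the heap copies, and the names `ToyStack`, `spawn`, `switch`, and `finish` are hypothetical, not part of greenlet's API.

```python
class ToyStack:
    """Toy model: one shared 'C-stack' plus per-greenlet heap copies."""

    def __init__(self, size):
        self.mem = bytearray(size)   # the single shared "C-stack"
        self.saved = {}              # name -> (base address, heap copy or None)

    def spawn(self, name, base):
        # A) Spawn: only record where this greenlet's slice starts.
        self.saved[name] = (base, None)

    def switch(self, current, target, top):
        # B) Switch: copy the running greenlet's slice [base, top) to the heap...
        base, _ = self.saved[current]
        self.saved[current] = (base, bytes(self.mem[base:top]))
        # ...then copy the target's saved slice back to the same addresses.
        target_base, copy = self.saved[target]
        if copy is not None:
            self.mem[target_base:target_base + len(copy)] = copy
            self.saved[target] = (target_base, None)

    def finish(self, name):
        # C) Return: drop the bookkeeping; the stack itself is not touched.
        del self.saved[name]


stack = ToyStack(16)
stack.spawn("a", base=4)
stack.spawn("b", base=4)            # both start "where we currently are"

stack.mem[4:8] = b"AAAA"            # greenlet a runs and uses 4 bytes
stack.switch("a", "b", top=8)       # a's slice goes to the heap; b has no copy yet
stack.mem[4:10] = b"BBBBBB"         # b runs and overwrites the same region
stack.switch("b", "a", top=10)      # b's slice is saved, a's slice is restored

restored = bytes(stack.mem[4:8])    # a's data is back at its original addresses
saved_b = stack.saved["b"]          # b's data now lives in the "heap"
stack.finish("b")
```

Both toy greenlets share the same region of the one stack, which is why the running one has to be copied out before the other can be copied back in.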

Basically, this is it. Many more details and explanations can be found at (http://www.stackless.com/pipermail/stackless-dev/2004-March/000022.html) or just by reading the code as pointed out in Alex's answer.

iMom0
Rabih Kodeih

If you get and study the greenlet sources, you'll see at the top of greenlet.c a long comment that starts at line 16 with the following summary...:

A PyGreenlet is a range of C stack addresses that must be saved and restored in such a way that the full range of the stack contains valid data when we switch to it.

and continues to line 82, summarizing exactly what you're asking about. Have you studied these lines (and the following 1000+ implementing them;-)...? I don't see a way to further squeeze these 66 lines down while still making sense, nor any added value in copying and pasting them here.

Basically, you'll see there is no real "hooking" to speak of (the C-level stack is switched back and forth "under the interpreter's nose", so to speak) except for the delicate interactions with thread state in multi-threaded code. The saving and restoring of a greenlet's state from/to the stack is based on memcpy calls, plus some calls to the Python memory manager to allocate/reallocate and free the space that holds data coming from, or going back to, the stack. The three functions in lines 227-295 handle the grunt work, and they're wrapped in a couple of C macros at 298-310 "in order to simplify maintenance", as the comment there says.
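
To make the memcpy-based save and restore concrete, here is a small sketch using Python's `ctypes` module. The 64-byte buffer standing in for a stack slice and the helper names `save_slice` and `restore_slice` are assumptions for illustration; greenlet itself does this in C on the real stack, with memory from the Python memory manager.

```python
import ctypes

SIZE = 64
stack = ctypes.create_string_buffer(SIZE)   # stands in for a C stack slice
base = ctypes.addressof(stack)

# A fake "frame": a local int at offset 32, and a pointer to it at offset 0.
ctypes.c_int.from_address(base + 32).value = 1234
ctypes.c_void_p.from_address(base).value = base + 32

def save_slice():
    # Allocate heap space and copy the slice out (the "save" half).
    heap_copy = ctypes.create_string_buffer(SIZE)
    ctypes.memmove(heap_copy, stack, SIZE)
    return heap_copy

def restore_slice(heap_copy):
    # Copy the slice back to exactly where it came from (the "restore" half).
    ctypes.memmove(stack, heap_copy, SIZE)

heap_copy = save_slice()
ctypes.memset(stack, 0, SIZE)               # another greenlet reuses the region

# The pointer stored inside the heap copy still refers to the stack address,
# not to anything inside the copy, so the copy is not usable where it sits.
stale_ptr = ctypes.c_void_p.from_address(ctypes.addressof(heap_copy)).value
pointer_targets_stack = (stale_ptr == base + 32)

restore_slice(heap_copy)
ptr = ctypes.c_void_p.from_address(base).value
value_after_restore = ctypes.c_int.from_address(ptr).value
```

The stored pointer only becomes valid again once the bytes are back at their original addresses, which is why the data is copied back in place rather than executed from the heap copy.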

The interface through which other C extensions can interact with the greenlet extension is implemented at lines 956-1045, and exposed through the "CObject API" (via greenlet.h, of course) documented here.

Alex Martelli
  • That comment block is confusing to me, and doesn't really answer my questions. I was just hoping for a concise, high-level summary/answer. Thanks for the pointers anyway - hope they're useful to others (or myself when I find more time to source-dive). – Yang Jul 28 '10 at 02:04
  • @Yang, those 86 lines **are** a concise, high-level summary -- hitting most of the highlights of the 1410 lines of code that together make up the `.c` and `.h` files! "Stack slices are saved by memcpy to memory that's allocated and reallocated by the Python memory manager, and restored by memcpy back into the stack (then the Python memory is freed)" is even more concise and higher level (but I did already say all this in my answer!) but obviously missing some important details (since it's two lines of text, not 86;-). What magic text do you expect to fall in-between and make you happy?! – Alex Martelli Jul 28 '10 at 02:16
  • For starters: what's "greenlet stack data"? Is that just bookkeeping for the greenlet? Or does it actually include certain C stack frames? What's a greenlet's "correct place in the stack"? Why are there always two greenlet blocks/why's the older one on heap? What's "data unrelated to this greenlet" below "greenlet stack data"? Diff btwn "unrelated data" and "newer data"? Etc. It's a small amount of C, but I'm also busy and this is not at all related to my current work - just asking out of curiosity. The question just popped into my head. Again, happy to source-dive later once I find time. – Yang Jul 28 '10 at 02:48
  • @Yang, the stack data of a greenlet is everything that's on the stack due to code executing in the greenlet, thus it most certainly includes stack frames (not sure what you think distinguishes a "C" stack frame from one from another language? C, Assembly, Fortran, whatever, they're stack frames). The correct place is exactly where the stack data was originally (since pointers into it are always involved it could not be usefully reloaded elsewhere). The small greenlet block that's always on the stack keeps that place. And this obvious info exhausts the space available in a comment, so, 'bye. – Alex Martelli Jul 28 '10 at 04:09
  • Why is it enough to save and restore stack only as stack is only part of the state of a process? What happens to all objects on the heap? – Piotr Dobrogost Jun 26 '13 at 07:39
  • Is there a general "industry term" for this technique that one could google to get more info (theory and otherwise)? – Noob Saibot Oct 17 '14 at 18:12
  • Do you happen to know if it uses `setjmp/longjmp` or `setcontext.h` under the hood? Or it implements same behavior differently? More specifically, since it has to restore the entire state of the interpreter process (including but not limited to python stack), as far as I understand, how does it manage to read\write more tricky parts like processor register values without using external libraries like ones mentioned above or assembly? – Ben Usman Oct 03 '18 at 21:24