18

I'm unclear on why the sub-interpreter API exists and why it's used in modules such as the mod_wsgi Apache module. Is it mainly used to create a security sandbox for different applications running within the same process, or is it a way to allow concurrency with multiple threads? Maybe both? Are there other purposes?

James Whetstone

2 Answers

13

I imagine the purpose is to create separate Python execution environments. For instance, mod_wsgi (the Apache Python module) hosts a single Python interpreter and, in its default configuration, runs multiple applications within sub-interpreters.

Some key points from the documentation:

  • This is an (almost) totally separate environment for the execution of Python code. In particular, the new interpreter has separate, independent versions of all imported modules, including the fundamental modules __builtin__, __main__ and sys.
  • The table of loaded modules (sys.modules) and the module search path (sys.path) are also separate.
  • Because sub-interpreters (and the main interpreter) are part of the same process, the insulation between them isn’t perfect — for example, using low-level file operations like os.close() they can (accidentally or maliciously) affect each other’s open files.
  • Because of the way extensions are shared between (sub-)interpreters, some extensions may not work properly; this is especially likely when the extension makes use of (static) global variables, or when the extension manipulates its module’s dictionary after its initialization.
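
To make the os.close() point concrete, here is a minimal sketch. It does not create sub-interpreters at all; it only shows that file descriptors belong to the process, so any code running in that process can close a descriptor it does not own. The function names are made up for illustration.

```python
import errno
import os


def close_foreign_descriptor(fd):
    """Stand-in for code in 'another' interpreter closing a descriptor it does not own."""
    os.close(fd)


def write_after_foreign_close():
    """Open a pipe, let unrelated code close the write end, then try to use it."""
    read_fd, write_fd = os.pipe()
    close_foreign_descriptor(write_fd)
    try:
        os.write(write_fd, b"hello")
    except OSError as exc:
        # The descriptor is gone for the whole process, not just for the closer.
        return exc.errno
    finally:
        os.close(read_fd)
    return None


if __name__ == "__main__":
    print(write_after_foreign_close() == errno.EBADF)
```

Nothing in the interpreter's own bookkeeping (sys.modules, sys.path and so on) is involved here, which is why keeping those separate does not protect against this kind of interference.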
codeape
  • Does this mean that different interpreters can run concurrently in different threads? I'm still unclear on whether or not different interpreters in the same process share the same GIL. – James Whetstone Apr 16 '09 at 15:05
  • The GIL is a global object for the process, and is shared among the sub-interpreters. So no, they cannot run concurrently. http://objectmix.com/python/377035-multiple-independent-python-interpreters-c-c-program.html – codeape Apr 17 '09 at 11:42
  • Thanks for the link! I've been trying to figure out whether there's any way around the threading limitations of Python and the GIL, and I'm not coming up with anything. – James Whetstone Apr 17 '09 at 20:14
  • The GIL only comes into play for execution of actual Python code. So, if you are using C extension modules which are able to do work without holding the GIL, you can still get some measure of concurrency. Some C extension modules deliberately partition their data so they can do this and thus benefit from multi-CPU/multi-core systems. – Graham Dumpleton Jun 25 '09 at 03:25
  • @GrahamDumpleton, I imagine that can also cause trouble if the modules unknowingly share states? – Prof. Falken Dec 07 '12 at 10:54
  • @AmigableClarkKant That does indeed occur. From memory, psycopg2 was broken in this way for a while because it didn't partition its data for different sub-interpreters properly, so it could only be used in one sub-interpreter at a time. – Graham Dumpleton Dec 08 '12 at 01:08
  • I believe the stdlib json module also has issues (at least it did some versions back). I experienced some weird issues that went away when I disabled sub-interpreter use in mod_wsgi. – codeape Dec 09 '12 at 19:49
  • Is any of this unsafe with database operations? Is it better to use fastcgi? – johnny Apr 15 '15 at 16:04
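
The comment above about C extensions working without the GIL can be sketched roughly as follows. This assumes CPython's hashlib, which is documented to release the GIL while hashing buffers larger than 2047 bytes; the helper names are invented for the example, and it only checks that the threaded results match the sequential ones rather than measuring any speed-up.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(chunk):
    """Hash one buffer; the C implementation can drop the GIL for large inputs."""
    return hashlib.sha256(chunk).hexdigest()


def hash_chunks_in_threads(chunks, workers=4):
    """Hash several buffers from a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, chunks))


if __name__ == "__main__":
    # Four 1 MiB buffers, each well above the size at which the GIL is released.
    chunks = [bytes([i]) * (1024 * 1024) for i in range(4)]
    threaded = hash_chunks_in_threads(chunks)
    sequential = [digest(c) for c in chunks]
    print(threaded == sequential)
```

How much real parallelism you get depends on the extension and the workload; pure Python code in those threads would still take turns holding the GIL.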
0

As I last understood it, the idea was to be able to execute multiple applications, as well as multiple copies of the same application, within the same process.

This is a feature found in other scripting languages (e.g. Tcl), and is of particular use to GUI builders, web servers, etc.

It breaks down in Python because many extensions are not safe to use with multiple interpreters, so one interpreter's actions can affect variables in another interpreter.
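
As a rough illustration of that failure mode, here is a pure-Python stand-in. It is not a real extension and does not use the C API; the two classes and the "app_a"/"app_b" ids are hypothetical. The first class behaves like a C extension with a static global, the second keeps its state partitioned per interpreter.

```python
class SharedStateExtension:
    """Mimics a C 'static' global: a single value for the whole process."""

    _setting = None

    @classmethod
    def configure(cls, interp_id, value):
        # interp_id is ignored, so every caller overwrites the same slot.
        cls._setting = value

    @classmethod
    def setting(cls, interp_id):
        return cls._setting


class PartitionedExtension:
    """Keeps one value per interpreter id, so callers cannot clobber each other."""

    _settings = {}

    @classmethod
    def configure(cls, interp_id, value):
        cls._settings[interp_id] = value

    @classmethod
    def setting(cls, interp_id):
        return cls._settings.get(interp_id)


def demo(extension):
    """Configure the extension from two pretend interpreters and read both back."""
    extension.configure("app_a", "utf-8")
    extension.configure("app_b", "latin-1")
    return extension.setting("app_a"), extension.setting("app_b")


if __name__ == "__main__":
    print(demo(SharedStateExtension))
    print(demo(PartitionedExtension))
```

With the shared version, the second application's configuration silently replaces the first one's; the partitioned version returns each application its own value.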

gbronner