Sandboxing Python code is notoriously difficult due to the power of the reflection facilities built into the language. At a minimum one has to take away the import mechanism and most of the built-in functions and global variables, and even then there are holes ({}.__class__.__base__.__subclasses__(), for instance).

In both Python 2 and 3, the 'sys' module is built into the interpreter and preloaded before user code begins to execute (even in -S mode). If you can get a handle to the sys module, then you have access to the global list of loaded modules (sys.modules) which enables you to do all sorts of naughty things.
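To make the stakes concrete, here is a minimal sketch of why a handle on sys is so valuable. It uses a plain import purely for demonstration (exactly what the question rules out), and the names module_table and builtins_module are my own, chosen for illustration:

```python
import sys  # used only for this demonstration; the question rules this out

module_table = sys.modules

# sys is registered in its own table, so each object leads to the other.
assert module_table['sys'] is sys

# Every other loaded module is reachable the same way, with no import statement.
builtins_module = module_table['builtins']   # Python 3 name; '__builtin__' on 2.x
print(builtins_module.__import__)
```

So anything that leaks either object effectively leaks the import machinery as well.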

So, the question: Starting from an empty module, without using the import machinery at all (no import statement, no __import__, no imp library, etc.), and also without using anything normally found in __builtins__ unless you can get a handle to it some other way, is it possible to acquire a reference to either sys or sys.modules? (Each points to the other.) I am interested in both 2.x and 3.x answers.

zwol
  • http://nedbatchelder.com/blog/201302/looking_for_python_3_builtins.html gives you access to `__builtins__`, giving you access to `__import__`, giving you access to `sys`. – Martijn Pieters Nov 23 '15 at 21:07
  • Arguably this is a dupe of [can you recover from reassigning \_\_builtins\_\_ in python?](http://stackoverflow.com/q/13307110) – Martijn Pieters Nov 23 '15 at 21:08
  • @MartijnPieters Argh, I meant to specify no `__import__`, no `imp`, etc. Will edit. – zwol Nov 23 '15 at 21:24
  • Without import, you need to find a `__globals__` with `sys` in it. Not much harder. – Martijn Pieters Nov 23 '15 at 21:26
  • But you can't just state *no `__import__`*. You don't have a choice about that, you can *recover* `__import__` by recovering `__builtins__`. – Martijn Pieters Nov 23 '15 at 21:28

1 Answer

__builtins__ can usually be recovered, giving you a path back to __import__ and thus to any module.

For Python 3 this comment from eryksun works, for example:

>>> f = [t for t in ().__class__.__base__.__subclasses__() 
...      if t.__name__ == 'Sized'][0].__len__
>>> f.__globals__['__builtins__']['__import__']('sys')
<module 'sys' (built-in)>

In Python 2, you just look for a different object:

>>> f = [t for t in ().__class__.__base__.__subclasses__()
...      if t.__name__ == 'catch_warnings'][0].__exit__.__func__
>>> f.__globals__['__builtins__']['__import__']('sys')
<module 'sys' (built-in)>

Either method starts from a built-in type you can create with literal syntax (here a tuple), walks up to its base class (object), lists that base's subclasses, and then picks a function object defined on one of them. Function objects carry a __globals__ reference to their module's namespace dictionary, and that dictionary gives you the __builtins__ object back.
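The same walk can be written without depending on one particular class name. The following is only a sketch for Python 3, under a few assumptions: recover_builtins is a hypothetical helper name, it uses getattr and isinstance for readability (a real escape would have to reach those some other way), and which class it happens to land on varies between interpreter versions.

```python
def recover_builtins():
    """Return the builtins namespace as a dict, found without any import."""
    object_type = ().__class__.__base__          # tuple -> object
    for cls in object_type.__subclasses__():
        # Look at raw class attributes; plain functions expose __globals__.
        for attr in cls.__dict__.values():
            globs = getattr(attr, '__globals__', None)
            if isinstance(globs, dict) and '__builtins__' in globs:
                found = globs['__builtins__']
                # In __main__ this is the module itself; elsewhere it is a dict.
                return found if isinstance(found, dict) else found.__dict__
    return None


recovered = recover_builtins()
recovered_sys = recovered['__import__']('sys')
print(recovered_sys)
```

Scanning for any function with a usable __globals__ is less brittle than relying on Sized or catch_warnings being loaded, at the cost of a slightly longer search.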

Note that you can't simply rule out __import__: it lives in __builtins__, so recovering __builtins__ recovers it too.

However, many of those __globals__ objects are bound to have sys present already. Searching for a sys module on Python 3, for example, gives me access to one in a flash:

>>> next(getattr(c, f).__globals__['sys']
...      for c in ().__class__.__base__.__subclasses__()
...      for f in dir(c)
...      if isinstance(getattr(c, f, None), type(lambda: None)) and
...         'sys' in getattr(c, f).__globals__)
<module 'sys' (built-in)>

The Python 2 version only needs to unwrap the unbound methods you find on classes to get the same result:

>>> next(getattr(c, f).__func__.__globals__['sys']
...      for c in ().__class__.__base__.__subclasses__()
...      for f in dir(c)
...      if isinstance(getattr(c, f, None), type((lambda: 0).__get__(0))) and
...         'sys' in getattr(c, f).__func__.__globals__)
<module 'sys' (built-in)>
Martijn Pieters