4

I want to disallow access to the file system from client code, so I thought I could override the open function:

from StringIO import StringIO  # Python 2 module

env = {
   'open': lambda *a: StringIO("you can't use open")
}

exec(open('user_code.py'), env)

but I got this error:

unqualified exec is not allowed in function 'my function' it contains a 
nested function with free variables

I also tried:

 def open_exception(*a):
     raise Exception("you can't use open")
 env = {
     'open': open_exception
 }

but I got the same exception (not "you can't use open").

I want to prevent executing this:

"""def foo():
     return open('some_file').read()
print foo()"""

and evaluating this:

"open('some_file').write('some text')"

I also use a session to store code that was evaluated previously, so I need to prevent executing this:

"""def foo(s):
   return open(s)"""

and then evaluating this:

"foo('some').write('some text')"

I can't use a regex because someone could hide the call with eval inside a string:

"eval(\"opxx('some file').write('some text')\".replace('xx', 'en'))"

Is there any way to prevent access to the file system inside exec/eval? (I need both.)

jcubic

5 Answers

10

There's no way to prevent access to the file system inside exec/eval. Here's example code showing a way for user code to reach otherwise restricted classes, and it always works:

import subprocess
code = """[x for x in ().__class__.__bases__[0].__subclasses__() 
           if x.__name__ == 'Popen'][0](['ls', '-la']).wait()"""
# Executing the `code` will always run `ls`...
exec code in dict(__builtins__=None)

And don't think about filtering the input, especially with regex.
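As a minimal sketch of why emptying `__builtins__` is not enough, here is a Python 3 variant of the lookup above that only locates the class and never calls it. The Python 3 syntax and the names `untrusted`, `env` and `reached_popen` are illustrative assumptions, not part of the original answer.

```python
import subprocess  # imported by the host program, never handed to the untrusted code

# Untrusted source: walks from a tuple literal up to `object`, then down to
# every direct subclass, without using a single builtin name.
untrusted = (
    "found = [c for c in ().__class__.__bases__[0].__subclasses__()"
    " if c.__name__ == 'Popen']"
)

env = {"__builtins__": {}}  # no builtins at all are exposed
exec(untrusted, env)

# True when the "sandboxed" code got hold of the real subprocess.Popen class.
reached_popen = any(c is subprocess.Popen for c in env["found"])
print(reached_popen)
```

The class is reachable only because the host process imported `subprocess` somewhere, which is exactly the situation described in the comments below.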

You might consider a few alternatives:

  1. ast.literal_eval if you could limit yourself only to simple expressions
  2. Using another language for user code. You might look at Lua or JavaScript - both are sometimes used to run unsafe code inside sandboxes.
  3. There's the pysandbox project, though I can't guarantee you that the sandboxed code is really safe. Python wasn't designed to be sandboxed, and in particular the CPython implementation wasn't written with sandboxing in mind. Even the author seems to doubt that such a sandbox can be implemented safely.
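For the first alternative, a rough sketch of how `ast.literal_eval` behaves may help. The helper name `evaluate_literal` and its `(ok, value)` return shape are assumptions made for illustration; only `ast.literal_eval` itself comes from the standard library.

```python
import ast

def evaluate_literal(text):
    """Return (True, value) for a plain literal, (False, None) otherwise."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Calls, attribute access, names and so on are rejected here.
        return False, None

accepted = evaluate_literal("[1, 2, {'a': 3}]")
rejected = evaluate_literal("open('some_file').read()")
print(accepted)
print(rejected)
```

Very large or deeply nested input can still exhaust memory or the recursion limit, so limiting the input size first is a sensible extra precaution.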
Rosh Oxymoron
  • Tcl also has a great sandbox - you can create whole nested interpreters that allow or disallow any or all commands. – Bryan Oakley Feb 24 '11 at 13:29
  • Your code hopefully will not work because I use custom globals and filter them to disallow importing subprocess, popen2, sys and os. BTW: your code is wicked. +1 – jcubic Feb 24 '11 at 15:51
  • Hopefully? You should probably test it :) Also, given the complexity of Python, there are likely many more exploits you haven't thought of. – Brian Goldman Feb 24 '11 at 15:58
  • @jcubic: You can never think of all possible ways to get out of the custom environment to avoid them. That's only one of them. It's enough that *another* module imports subprocess, and you're doomed. – Rosh Oxymoron Feb 24 '11 at 16:12
  • I check all functions from the environment with `if type(f) == types.BuiltinFunctionType and f.__module__ in modules:` (I don't know if this will work every time), but if people can access the filesystem then they can write CGI scripts (which don't need to be written in Python) and run them using a browser. – jcubic Feb 24 '11 at 17:10
  • Using the code you suggested, `().__class__.__bases__[0].__subclasses__()`, I found `socket`, `StringIO` and `file`; the last one can be used to access files: `[x for x in ().__class__.__bases__[0].__subclasses__() if x.__name__ == 'file'][0]("file.py", "w")` – jcubic Feb 25 '11 at 02:02
  • I feel like I should leave a note here for unwary posterity, that even removing the entire set of globals and modules from the execution context of the untrusted code is not sufficient, unless you're filtering the untrusted source so vigorously that using `ast.literal_eval` is a better option. Python is *extremely* difficult to sandbox effectively from within itself. – the paul Aug 31 '12 at 00:02
  • Pysandbox prevents access to the variables in your program, not the system. Is there a sort of reverse for that which prevents access to the system but allows access to variables inside the program? – trevorKirkby May 19 '14 at 18:22
5

This actually can be done.

Practically what you describe can be accomplished on Linux, contrary to the other answers here. You can build a setup with an exec-like call that runs untrusted code under security that is reasonably difficult to penetrate and that still allows the result to be output. The untrusted code cannot access the filesystem at all, except for reading specifically allowed parts of the Python VM and standard library.

If that's close enough to what you wanted, read on.

I'm envisioning a system where your exec-like function spawns a subprocess under a very strict AppArmor profile, such as the one used by Straitjacket (see here and here). This will limit all filesystem access at the kernel level, other than files specifically allowed to be read. This will also limit the process's stack size, max data segment size, max resident set size, CPU time, the number of signals that can be queued, and the address space size. The process will have locked memory, cores, flock/fcntl locks, POSIX message queues, etc, wholly disallowed. If you want to allow using size-limited temporary files in a scratch area, you can mkstemp it and make it available to the subprocess, and allow writes there under certain conditions (make sure that hard links are absolutely disallowed). You'd want to make sure to clear out anything interesting from the subprocess environment and put it in a new session and process group, and close all FDs in the subprocess except for the stdin/stdout/stderr, if you want to allow communication with those.
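The process-level part of that setup could be sketched roughly as follows in Python 3 on Linux. This shows only the resource limits, the cleared environment, the new session and the closed file descriptors; it does not apply an AppArmor profile, which has to be configured outside Python, so on its own it does not restrict filesystem access. The names `run_untrusted` and `_apply_limits` and all limit values are assumptions chosen for illustration.

```python
import resource
import subprocess
import sys

def _apply_limits():
    """Runs in the child just before exec: tighten a few resource limits."""
    def cap(which, value):
        _, hard = resource.getrlimit(which)
        if hard != resource.RLIM_INFINITY:
            value = min(value, hard)  # never ask for more than is allowed
        resource.setrlimit(which, (value, value))

    cap(resource.RLIMIT_CPU, 2)         # seconds of CPU time (assumed value)
    cap(resource.RLIMIT_AS, 1024 ** 3)  # bytes of address space (assumed value)
    cap(resource.RLIMIT_CORE, 0)        # no core dumps

def run_untrusted(source, timeout=5):
    """Run `source` in a separate, isolated-mode interpreter process."""
    return subprocess.run(
        [sys.executable, "-I", "-c", source],
        env={},                     # nothing interesting in the environment
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        close_fds=True,             # only stdin/stdout/stderr stay open
        start_new_session=True,     # new session and process group
        preexec_fn=_apply_limits,
        timeout=timeout,            # wall-clock limit enforced by the parent
        universal_newlines=True,
    )

result = run_untrusted("print(1 + 1)")
print(result.returncode, result.stdout)
```

Without the kernel-level profile the child can still read and write any file the parent's user can, which is why the answer treats AppArmor as the essential piece.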

If you want to be able to get a Python object back out from the untrusted code, you could wrap it in something which prints the result's repr to stdout, and after you check its size, you evaluate it with ast.literal_eval(). That pretty severely limits the possible types of object that can be returned, but really, anything more complicated than those basic types probably carries the possibility of sekrit maliciousness intended to be triggered within your process. Under no circumstances should you use pickle for the communication protocol between the processes.
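The parent-side half of that protocol might look like the sketch below (Python 3). The function name `decode_result` and the cap `MAX_RESULT_CHARS` are hypothetical; the child is assumed to have printed the `repr` of its result to stdout.

```python
import ast

MAX_RESULT_CHARS = 10000  # hypothetical cap; pick one that suits your data

def decode_result(raw_stdout, max_chars=MAX_RESULT_CHARS):
    """Check the size first, then parse the child's repr() output as a literal."""
    if len(raw_stdout) > max_chars:
        raise ValueError("result too large")
    # literal_eval only builds basic types: numbers, strings, bytes, tuples,
    # lists, dicts, sets, booleans and None. Anything else raises an error.
    return ast.literal_eval(raw_stdout.strip())

# Simulated child output: the repr of the result followed by a newline.
decoded = decode_result(repr({"answer": [1, 2, 3]}) + "\n")
print(decoded)
```

Because nothing is executed while parsing, a hostile child can at worst make this call raise, which the parent can catch and treat as a failed run.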

the paul
5

You can't turn exec() and eval() into a safe sandbox. You can always get access to the builtin module, as long as the sys module is available:

sys.modules[().__class__.__bases__[0].__module__].open

And even if sys is unavailable, you can still get access to any new-style class defined in any imported module in basically the same way. This includes all the IO classes in io.
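Both claims can be checked with a short Python 3 sketch (in Python 3 the module is called `builtins` and every class is new-style). The helper `reachable_classes` is a hypothetical name, and the walk is done from host code purely to show what is reachable.

```python
import builtins
import io
import sys

def reachable_classes(root):
    """Collect every direct and indirect subclass of `root`, keyed by id."""
    seen = {}
    stack = [root]
    while stack:
        cls = stack.pop()
        # type.__subclasses__(cls) also works when cls is `type` itself.
        for sub in type.__subclasses__(cls):
            if id(sub) not in seen:
                seen[id(sub)] = sub
                stack.append(sub)
    return seen

root = ().__class__.__bases__[0]  # this is `object`
classes = reachable_classes(root)

# The sys.modules lookup from the answer lands on the builtins module.
found_builtins = sys.modules[root.__module__] is builtins
# The raw file class from io is reachable without naming io at all.
file_io_reachable = any(c is io.FileIO for c in classes.values())
print(found_builtins, file_io_reachable)
```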

  • +1 For another wicked example of Python code, but the `sys` module is never loaded for user code. – jcubic Feb 24 '11 at 15:52
3

Overriding open at module level, as @Brian suggests, doesn't work:

def raise_exception(*a):
    raise Exception("you can't use open")

open = raise_exception

print eval("open('test.py').read()", {})

This displays the content of the file, but this (merging @Brian's and @lunaryorn's answers):

import sys
def raise_exception(*a):
    raise Exception("you can't use open")

__open = sys.modules['__builtin__'].open
sys.modules['__builtin__'].open = raise_exception

print eval("open('test.py').read()", {})

will throw this:

Traceback (most recent call last):
  File "./test.py", line 11, in <module>
    print eval("open('test.py').read()", {})
  File "<string>", line 1, in <module>
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 48, in apport_excepthook
    if not enabled():
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 23, in enabled
    conf = open(CONFIG).read()
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open

Original exception was:
Traceback (most recent call last):
  File "./test.py", line 11, in <module>
    print eval("open('test.py').read()", {})
  File "<string>", line 1, in <module>
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open

and you can still access the original open outside user code via __open.

jcubic
  • If you don't pass an alternative object for `__builtins__` in exec or eval, you would be able to apply my trick to locate the `file` class and use it to open a file, regardless of what and where you replace. If you use the code that you've given, you can access the original open as `open.func_globals['__open']` – Rosh Oxymoron Feb 24 '11 at 18:41
1

"Nested function" refers to the fact that it's declared inside another function, not that it's a lambda. Declare your open override at the top level of your module and it should work the way you want.

Also, I don't think this is totally safe. Preventing open is just one of the things you need to worry about if you want to sandbox Python.

Brian Goldman