
Problem Description

I am curious whether it is possible to exec a string within a function as if the string were pasted in place of the exec call (with appropriate indentation). I understand that in 99.9% of cases you shouldn't be using exec, but I'm more interested in whether this can be done than in whether it should be done.

The behavior I want is equivalent to:

GLOBAL_CONSTANT = 1

def test_func():
    def A():
        return GLOBAL_CONSTANT
    def B():
        return A()
    return B

func = test_func()
assert func() == 1

But I am given instead:

GLOBAL_CONSTANT = 1

EXEC_STR = """
def A():
    return GLOBAL_CONSTANT
def B():
    return A()
"""

def exec_and_extract(exec_str, var_name):
    # Insert code here

func = exec_and_extract(EXEC_STR, 'B')
assert func() == 1

Failed Attempts

def exec_and_extract(exec_str, var_name):
    exec(exec_str)  # equivalent to exec(exec_str, globals(), locals())
    return locals()[var_name]

Calling func() raises NameError: name 'A' is not defined. A and B exist inside exec_and_extract's locals(), but while A or B is running, names are looked up in exec_and_extract's globals().


def exec_and_extract(exec_str, var_name):
    exec(exec_str, locals())  # equivalent to exec(exec_str, locals(), locals())
    return locals()[var_name]

NameError: name 'GLOBAL_CONSTANT' is not defined when calling A from inside func() since the execution context of A is exec_and_extract's locals() which does not contain GLOBAL_CONSTANT.


def exec_and_extract(exec_str, var_name):
    exec(exec_str, globals())  # equivalent to exec(exec_str, globals(), globals())
    return globals()[var_name]

Works but pollutes global namespace, not equivalent.


def exec_and_extract(exec_str, var_name):
    locals().update(globals())
    exec(exec_str, locals())  # equivalent to exec(exec_str, locals(), locals())
    return locals()[var_name]

Works, but requires copying the entire contents of exec_and_extract's globals() into its locals(), which is a waste of time if globals() is large (of course not applicable in this contrived example). Additionally, it is subtly different from the "paste in code" version: if one of the arguments to exec_and_extract happened to be named GLOBAL_CONSTANT (a terrible argument name), the behavior would differ (the "paste in" version would use the argument value, while this code would use the global constant value).

Further Constraints

Trying to cover any "loopholes" in the problem statement:

  • The exec_str value should represent arbitrary code that can access global or local scope variables.
  • Solution should not require analysis of what global scope variables are accessed within exec_str.
  • There should be no "pollution" between subsequent calls to exec_and_extract (in the global namespace or otherwise). That is, in this example, execution of EXEC_STR should not leave A around to be referenceable in future calls to exec_and_extract.
nj3
  • "which is a waste of time if globals() is large " Not really, no. I mean, you realize, none of the objects are actually copied. – juanpa.arrivillaga Apr 24 '20 at 05:44
  • @juanpa.arrivillaga Yes of course, only copies values for primitives and object "references" (not sure what the correct term is) for everything else. This may not be as expensive as I thought but also does have the issue that `user2357112 supports Monica` pointed out below where it takes a snapshot of `globals()` and does not see any future updates to those variables. – nj3 Apr 24 '20 at 05:46
  • No, it doesn't copy values for "primitives", python doesn't have "primitives", *everything is an object, and everything uses reference semantics here*. But yes, the globals snapshot problem does persist. I'm just saying, it is definitely not an expensive operation. At most, a millisecond? And that's for probably an unusually large `globals()` otherwise you are on the order of hundreds of nanoseconds on any modern machine. – juanpa.arrivillaga Apr 24 '20 at 05:48
  • Thanks for clarifying on python copy semantics. – nj3 Apr 24 '20 at 05:54

3 Answers

2

This is impossible. exec interacts badly with local variable scope mechanics, and it is far too restricted for anything like this to work. In fact, literally any local variable binding operation in the executed string is undefined behavior, including plain assignment, function definitions, class definitions, imports, and more, if you call exec with the default locals. Quoting the docs:

The default locals act as described for function locals() below: modifications to the default locals dictionary should not be attempted. Pass an explicit locals dictionary if you need to see effects of the code on locals after function exec() returns.

Additionally, code executed by exec cannot return, break, yield, or perform other control flow on behalf of the caller. It can break loops that are part of the executed code, or return from functions defined in the executed code, but it cannot interact with its caller's control flow.
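As a rough illustration of the control-flow restriction (a sketch only; the helper name `try_exec` is made up for this example), a bare `return` or `break` in the executed string is compiled as module-level code and rejected before it runs, while a `break` inside a loop that belongs to the string itself is fine:

```python
def try_exec(snippet):
    """Return the exception class name raised by exec(snippet), or None."""
    try:
        exec(snippet, {})  # fresh namespace so nothing leaks between calls
    except SyntaxError as exc:
        return type(exc).__name__
    return None

outcomes = {
    # Control flow aimed at the caller: rejected when the string is compiled.
    "return 1": try_exec("return 1"),
    "break": try_exec("break"),
    # Control flow contained in the executed code itself: accepted.
    "loop": try_exec("for i in range(3):\n    break"),
}
print(outcomes)
```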


If you're willing to sacrifice the requirement to be able to interact with the calling function's locals (as you mentioned in the comments), and you don't care about interacting with the caller's control flow, then you could insert the code's AST into the body of a new function definition and execute that:

import ast
import sys

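def exec_and_extract(code_string, var):
    # Parse the input, then build a stub function whose only statement returns `var`.
    original_ast = ast.parse(code_string)
    new_ast = ast.parse('def f(): return ' + var)
    fdef = new_ast.body[0]
    # Splice the parsed statements in front of the stub's return statement.
    fdef.body = original_ast.body + fdef.body
    code_obj = compile(new_ast, '<string>', 'exec')

    # Run against the caller's real globals; only `f` lands in the scratch locals dict.
    gvars = sys._getframe(1).f_globals
    lvars = {}
    exec(code_obj, gvars, lvars)

    return lvars['f']()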

I've used an AST-based approach instead of string formatting to avoid problems like accidentally inserting extra indentation into triple-quoted strings in the input.
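To make the indentation concern concrete, here is a small sketch comparing the two wrapping strategies on an input that contains a multi-line triple-quoted string. The names `SNIPPET`, `wrapper`, `from_string`, and `from_ast` are illustrative and not part of the original code:

```python
import ast
import textwrap

SNIPPET = 'TEXT = """first\nsecond"""\n'

# String-based wrapping: every line gets indented, including the line
# that sits inside the triple-quoted literal, so the value changes.
wrapped_src = "def wrapper():\n" + textwrap.indent(SNIPPET, "    ") + "    return TEXT\n"
ns_string = {}
exec(wrapped_src, ns_string)
from_string = ns_string["wrapper"]()

# AST-based wrapping: the literal is parsed before it is moved into the
# function body, so its value is left untouched.
tree = ast.parse("def wrapper(): return TEXT")
tree.body[0].body = ast.parse(SNIPPET).body + tree.body[0].body
ns_ast = {}
exec(compile(tree, "<string>", "exec"), ns_ast)
from_ast = ns_ast["wrapper"]()

print(repr(from_string))
print(repr(from_ast))
```

With string formatting the second line of the literal picks up the four spaces of added indentation; with the AST splice it does not.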

sys._getframe(1) lets us use the globals of whoever called exec_and_extract, rather than exec_and_extract's own globals, even if the caller is in a different module.

Functions defined in the executed code see the actual globals rather than a copy.

The extra wrapper function in the modified AST avoids some scope issues that would occur otherwise; particularly, B wouldn't be able to see A's definition in your example code otherwise.
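The scope issue without the wrapper can be reproduced in a few lines. This is a sketch under the assumption that a plain dict (`shared_globals`, a hypothetical stand-in) plays the role of the caller's real globals:

```python
SNIPPET = """
def A():
    return GLOBAL_CONSTANT
def B():
    return A()
"""

shared_globals = {"GLOBAL_CONSTANT": 1}  # stand-in for the caller's globals()
scratch_locals = {}

# Separate globals and locals: definitions are stored in scratch_locals.
exec(SNIPPET, shared_globals, scratch_locals)
defined_names = sorted(scratch_locals)

# A can read GLOBAL_CONSTANT, because its globals are shared_globals.
a_result = scratch_locals["A"]()

# B looks A up in shared_globals, where it was never stored.
try:
    scratch_locals["B"]()
    error = None
except NameError as exc:
    error = str(exc)

print(defined_names, a_result, error)
```

Nothing leaks into the globals dict, but the functions cannot find each other, which is the gap the wrapper function closes.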

user2357112
  • Thanks for the quick reply! What if I relax the requirements so that it is not equivalent but is still runnable. In particular I do not require that the `exec`ed string access or modify the `exec_and_extract` function's `locals()` but I still want it to be able to access global scope variables and other variables defined in the `exec`ed string (like `B` accessing `A`). Therefore it is no longer a requirement to use the default `locals()`. – nj3 Apr 24 '20 at 05:34
  • @nj3: There are still conflicts. If you want the executed code to see the original global variable scope rather than a copy, you have to pass the actual globals() to exec rather than a copy. If you don't want anything defined in the executed code to pollute globals, you have to pass a (non-default) locals dict. That, in turn, means that functions defined in the executed code can only see globals, not locals (this is due to the interaction of exec and closure mechanics). Particularly, functions defined in the executed code can't call each other. – user2357112 Apr 24 '20 at 05:50
  • I see, so the conclusion is the exact semantics of the adjusted problem definition (see global variables and other variables defined within the `exec_str`) is not possible. As pointed out by the others, if ok with not seeing future changes to global scope variables it's possible to make the (non-default) local dict a copy of `globals()` (which should be fast) and pass that to `exec`. Thanks again for the help! – nj3 Apr 24 '20 at 05:56
  • @user2357112supportsMonica damn, I was thinking maybe something like passing `collections.ChainMap({}, globals())` so it wouldn't pollute the global namespace but still have read access to it would be a clever hack, but alas, the `globals` argument *must be a `dict`* – juanpa.arrivillaga Apr 24 '20 at 05:59
  • @juanpa.arrivillaga Haha, I was also considering some sort of chain map solution for the same reason but unfortunately not possible. – nj3 Apr 24 '20 at 06:00
  • One thing that could get you closer: have the function modify the executed string to place all the executed code in a new function body. Python would then be able to analyze that new function's locals, avoiding the last problem. – user2357112 Apr 24 '20 at 06:06
  • You can hack your way around the "chain map isn't a dict" issue by making a new class that inherits both: `class ChainMapDict(collections.ChainMap, dict): pass`. Obviously ChainMap was *not* designed with that sort of usage in mind, but after a bit of playing around with ChainMapDict, it seems to work the same as a regular ChainMap as far as I can tell, and `exec()` accepts it. – RoadrunnerWMC Apr 24 '20 at 06:32
  • @RoadrunnerWMC: The docs for `exec` say "If only globals is provided, it must be a dictionary (**and not a subclass of dictionary**), which will be used for both the global and the local variables.", so that may actually be even less reliable than it looks. – user2357112 Apr 24 '20 at 06:49
0

Works but pollutes global namespace, not equivalent.

Then how about making a copy of the globals() dict, and retrieving B from that?

def exec_and_extract(exec_str, var_name):
    env = dict(globals())
    env.update(locals())
    exec(exec_str, env)  # use the parameter, not the module-level EXEC_STR
    return env[var_name]

This still works, and doesn't pollute the global namespace.
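One caveat of the copy-based approach is that the returned function keeps reading from the snapshot. The sketch below shows this with an explicit dict standing in for the real globals; `exec_in_copy` and `real_globals` are hypothetical names used only for this demonstration:

```python
SNIPPET = """
def A():
    return GLOBAL_CONSTANT
def B():
    return A()
"""

def exec_in_copy(exec_str, var_name, caller_globals):
    env = dict(caller_globals)  # shallow copy: same objects, new dict
    exec(exec_str, env)
    return env[var_name]

real_globals = {"GLOBAL_CONSTANT": 1}  # stand-in for the caller's globals()
func = exec_in_copy(SNIPPET, "B", real_globals)

before = func()
real_globals["GLOBAL_CONSTANT"] = 2  # rebinding in the original dict ...
after = func()                       # ... is not seen through the copy

print(before, after, "A" in real_globals)
```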

RoadrunnerWMC
  • This gets closer to the (amended) goal, but still isn't equivalent. One problem is that functions defined in the executed code cannot see updates to the original global variables. – user2357112 Apr 24 '20 at 05:43
  • Thanks! See the last failed attempt. Essentially equivalent except copies globals into locals instead of a new dictionary (this solution is definitely better if I don't want to mess with `locals()` as the other commenter said). That said it has the same downside of requiring a copy of all global variables. Edit: also agreed with ^, very good point about changes to globals. Thanks! – nj3 Apr 24 '20 at 05:44
  • I edited my answer to address the "what if `GLOBAL_CONSTANT` was an argument name" concern. As far as the executed code being able to see changes to the global namespace (due to other threads or something)... yeah, I doubt that can be completely solved, unfortunately. – RoadrunnerWMC Apr 24 '20 at 05:55
  • Thanks for the feedback. Ya it seems like the problem is not fully solvable. Also, globals could simply be changed after the call to `exec_and_extract` and before executing the returned function, no need for concurrency. – nj3 Apr 24 '20 at 05:58
0

@user2357112supportsMonica (Responding to comment in thread since this contains code block)

Seems like something like this might work:

import textwrap

def exec_and_extract(exec_str, var_name):
    env = {}
    # Indent the input so it becomes the body of a wrapper function.
    modified_exec_str = """def wrapper():
{body}
    return {var_name}
    """.format(body=textwrap.indent(exec_str, '    '), var_name=var_name)
    exec(modified_exec_str, globals(), env)
    return env['wrapper']()

This allows access to the global scope, including future changes to it, as well as access to other variables defined inside the exec_str.
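A quick check of both properties, written as a sketch: the variant below takes the globals dict as an explicit parameter (`caller_globals`, an assumption made so the demonstration does not depend on the real module namespace), and `exec_and_extract_in` is a hypothetical name:

```python
import textwrap

SNIPPET = """
def A():
    return GLOBAL_CONSTANT
def B():
    return A()
"""

def exec_and_extract_in(exec_str, var_name, caller_globals):
    # Same wrapping idea as above, with the globals dict passed in explicitly.
    src = "def wrapper():\n{body}\n    return {name}\n".format(
        body=textwrap.indent(exec_str, "    "), name=var_name)
    env = {}
    exec(src, caller_globals, env)  # only `wrapper` is stored, and only in env
    return env["wrapper"]()

caller_globals = {"GLOBAL_CONSTANT": 1}
func = exec_and_extract_in(SNIPPET, "B", caller_globals)

before = func()
caller_globals["GLOBAL_CONSTANT"] = 2  # later change to the live globals
after = func()

print(before, after, "A" in caller_globals)
```

B reaches A through the wrapper's closure, and A reads GLOBAL_CONSTANT from the live dict rather than from a copy.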

nj3
  • I would have gone for the `ast` module instead of string formatting, to avoid problems like accidentally inserting spurious indentation into triple-quoted strings in the input. (Actually, I'm writing up an `ast`-based approach right now.) – user2357112 Apr 24 '20 at 06:21