I am attempting to implement a decorator that receives a function, parses it into an AST, will eventually modify the AST, and then reconstructs the original (or modified) function from the AST and returns it. My current approach, once I have the AST, is to compile it to a code <module> object, find the constant in it with the name of the function, convert it to a FunctionType, and return it. I have the following:

import ast, inspect, types

def as_ast(f):
    source = inspect.getsource(f)
    source = '\n'.join(source.splitlines()[1:]) # Remove as_ast decoration, pretend there can be no other decorations for now
    tree = ast.parse(source)
    print(ast.dump(tree, indent=4)) # Debugging log
    # I would modify the AST somehow here
    filename = f.__code__.co_filename
    code = compile(tree, filename, 'exec')
    func_code = next(
        filter(
            lambda x: isinstance(x, types.CodeType) and x.co_name == f.__name__,
            code.co_consts)) # Get function object
    func = types.FunctionType(func_code, {})
    return func

@as_ast
def test(arg: int=4):
    print(f'{arg=}')

I would expect that calling test later in this source code behaves exactly as it would if the decorator were absent, and that is what I observe, so long as I pass an argument for arg. However, if I pass no argument, instead of using the default I gave (4), it raises a TypeError for the missing argument. This makes it pretty clear that my approach for getting a callable function from the AST is not quite correct: the default argument is not applied, and other details may slip through as well. How can I correctly recreate the function from the AST? The way I currently go from the module code object to the function code object also seems off intuitively, but I do not know how else one might achieve this.

user2649681
The easier way to get the actual function object would be to call `exec(code)` - the function will be stored with its own name in the global namespace. To avoid accidentally overwriting something of yours, you should probably pass a different namespace to `exec()` - so something like `d = {}` / `exec(code, d)` / `return d[f.__name__]` should work. – jasonharper Aug 27 '22 at 21:24

1 Answer

The root node of the AST is a Module, so calling compile() on the AST produces a code object for a module. Disassembling that code object with dis.dis() from the standard library shows that the module-level code builds the function and stores it in the global namespace. The default values are attached at that point, which is why building a FunctionType directly from the inner code object loses them. So the easiest thing to do is exec the compiled code and then get the function from the 'global' environment of the exec call.
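To make the difference concrete, here is a minimal sketch that compiles a small function from a source string (so it does not depend on inspect.getsource) and compares the two approaches. The function name `show` and the filename `<example>` are placeholders chosen for this illustration.

```python
import ast
import types

# A stand-in for the decorated function's source (hypothetical example).
source = "def show(arg=4):\n    return arg\n"
module_code = compile(ast.parse(source), "<example>", "exec")

# Approach from the question: pull the inner code object out of co_consts.
func_code = next(
    c for c in module_code.co_consts
    if isinstance(c, types.CodeType) and c.co_name == "show"
)
bare = types.FunctionType(func_code, {})
print(bare.__defaults__)     # None: the defaults are not part of the code object

# Approach from this answer: run the module-level code and look the function up.
namespace = {}
exec(module_code, namespace)
rebuilt = namespace["show"]
print(rebuilt.__defaults__)  # (4,)
print(rebuilt())             # 4
```

Calling `dis.dis(module_code)` on the same object shows the module-level instructions that create the function and store it under its name.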

The AST node for the function includes a list of the decorators to be applied to the function. Any decorators that haven't been applied yet should be deleted from the list so they don't get applied twice (once when this decorator compiles the code, and once after this decorator returns). And delete this decorator from the list or you'll get an infinite recursion. The question is what to do with any decorators that came before this one. They have already run, but their result is tossed out because this decorator (as_ast) goes back to the source code. You can leave them in the list so they get rerun, or delete them if they don't matter.
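If the decorators that already ran should be kept, one possible approach is to trim the list rather than clear it. In the AST, decorator_list is stored in source order (outermost first) and decorators are applied bottom-up, so the entries listed after as_ast have already run and the entries before it have not. The helper below is a hypothetical sketch, not part of the original answer, and it assumes the decorator is written as a bare name (`@as_ast` or `@as_ast(...)`), not as an attribute such as `@module.as_ast`.

```python
import ast

def strip_pending_decorators(func_def, this_name="as_ast"):
    """Drop `this_name` and every decorator listed above it.

    Hypothetical helper: decorators listed below `this_name` have already
    run on the original function and are kept so that they are re-applied
    when the rebuilt source is executed.
    """
    for index, dec in enumerate(func_def.decorator_list):
        # Handle both `@as_ast` and `@as_ast(...)`.
        target = dec.func if isinstance(dec, ast.Call) else dec
        if isinstance(target, ast.Name) and target.id == this_name:
            func_def.decorator_list = func_def.decorator_list[index + 1:]
            return
    # `this_name` was not found: leave the list unchanged.

tree = ast.parse("@outer\n@as_ast\n@inner\ndef f():\n    pass\n")
func_def = tree.body[0]
strip_pending_decorators(func_def)
print([d.id for d in func_def.decorator_list])  # ['inner']
```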

In the code below, all the decorators are deleted from the parse tree, under the assumption that the as_ast decorator is applied first. The call to exec() uses a copy of globals() so the decorator has access to any other globally visible names (variables, functions, etc). See the docs for exec() for other considerations. Uncomment the print statements to see what is going on.

import ast
import dis
import inspect
import types

def as_ast(f):
    source = inspect.getsource(f)
    
    #print(f"=== source ===\n{source}")
    tree = ast.parse(source)
    
    #print(f"\n=== original ===\n{ast.dump(tree, indent=4)}")

    # Remove the decorators from the AST, because the modified function will 
    # be passed to them anyway and we don't want them to be called twice.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            node.decorator_list.clear()
     
    # Make modifications to the AST here

    #print(f"\n=== revised ===\n{ast.dump(tree, indent=4)}")
    
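    name = f.__code__.co_name
    # The second argument of compile() is the filename used in tracebacks,
    # so pass the file the function came from rather than the function name.
    code = compile(tree, f.__code__.co_filename, 'exec')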
    
    #print("\n=== byte code ===")
    #dis.dis(code)
    #print()

    temp_globals = dict(globals())
    exec(code, temp_globals)
    
    return temp_globals[name]

Note: this decorator has not been tested much and has not been tested at all on methods or nested functions.
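One caveat about the namespace: globals() inside as_ast returns the globals of the module where as_ast is defined, which matches the decorated function's module only when both live in the same file. A possible variation, sketched here under that assumption, is to copy f.__globals__ instead. The helper name `exec_in_function_globals` and the `greet` example are made up for this illustration, and the "other module" is simulated with a plain dict.

```python
import ast

def exec_in_function_globals(tree, f):
    """Compile `tree` and run it in a copy of f's own globals (hypothetical helper)."""
    code = compile(tree, f.__code__.co_filename, "exec")
    namespace = dict(f.__globals__)  # copy, so the original module is not modified
    exec(code, namespace)
    return namespace[f.__name__]

# Simulate a function that lives in another module with its own global name.
other_module_globals = {"GREETING": "hello"}
exec("def greet(name='world'):\n    return f'{GREETING}, {name}'\n",
     other_module_globals)
greet = other_module_globals["greet"]

# Pretend this is the (slightly modified) AST of greet.
tree = ast.parse("def greet(name='world'):\n    return f'{GREETING}, {name}!'\n")
rebuilt = exec_in_function_globals(tree, greet)
print(rebuilt())  # hello, world!
```

Because the namespace is a copy, the rebuilt function does not see names that are added to the original module afterwards; whether that matters depends on the use case.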

An interesting idea would be for as_ast to return the AST. Then subsequent decorators could manipulate the AST. Lastly, a from_ast decorator could compile the modified AST into a function.
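A rough sketch of that pipeline might look like the following. Everything here is an assumption made for illustration: the `func_globals` attribute stashed on the tree, the `double_constants` transformation, and the requirement that the function definition is the first statement in the parsed source. It has the same limitations as the decorator above.

```python
import ast
import inspect
import textwrap

def as_ast(f):
    """Return the AST (a Module node) of f instead of a function."""
    tree = ast.parse(textwrap.dedent(inspect.getsource(f)))
    tree.body[0].decorator_list.clear()  # assumes the def is the first statement
    tree.func_globals = f.__globals__    # hypothetical convention used by from_ast
    return tree

def double_constants(tree):
    """Example transformation (made up): double every integer literal."""
    class Doubler(ast.NodeTransformer):
        def visit_Constant(self, node):
            if isinstance(node.value, int) and not isinstance(node.value, bool):
                return ast.copy_location(ast.Constant(node.value * 2), node)
            return node
    return Doubler().visit(tree)

def from_ast(tree):
    """Compile the (possibly modified) AST and return the resulting function."""
    ast.fix_missing_locations(tree)
    namespace = dict(getattr(tree, "func_globals", {}))
    exec(compile(tree, "<ast>", "exec"), namespace)
    return namespace[tree.body[0].name]

# Intended usage in a source file (decorators are applied bottom-up):
#
#     @from_ast
#     @double_constants
#     @as_ast
#     def add_one(x=1):
#         return x + 1
```

With that stack, as_ast runs first and hands the tree to double_constants, and from_ast finally turns the result back into a callable.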

RootTwo