I have been playing with the dis library to disassemble some Python source code, but I see that this does not recurse into functions or classes:

import dis

source_py = "test.py"

with open(source_py) as f_source:
    source_code = f_source.read()

byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)

All I see are entries such as:

 54         456 LOAD_CONST              63 (<code object foo at 022C9458, file "test.py", line 54>)
            459 MAKE_FUNCTION            0
            462 STORE_NAME              20 (foo)

If the source file had a function foo(), I could obviously add something like the following to the source file:

dis.dis(foo)

I cannot figure out how to do this without changing the source file and executing it. I would like to be able to extract the pertinent bytes from the compiled byte_code and pass them to dis.dis().

def sub_byte_code(byte_code, function_or_class_name):
    sub_byte_code = xxxxxx
    dis.dis(sub_byte_code)

I have considered wrapping the source code and calling dis.dis() as follows, but I do not wish to execute the script:

source_code_dis = "import dis\n%s\ndis.dis(foo)\n" % (source_code)
exec(source_code_dis)

Is there perhaps a trick to calling it? e.g. dis.dis(byte_code, recurse=True)

Martin Evans

2 Answers

Import the file as a module and call dis.dis() on that module.

import dis
import test

dis.dis(test)

You can also do this from the command-line:

python -m dis test.py

Quoting from the documentation for dis.dis:

For a module, it disassembles all functions.

Edit: As of Python 3.7, dis.dis is recursive.

Roland Smith
  • Excellent, worked perfectly. I had spotted that in the docs but for some reason hadn't thought to use `import`. – Martin Evans Aug 13 '15 at 14:29
  • The first approach works, but `python -m dis test.py` doesn't. It disassembles only the first level of the source code, without recursion. All functions appear as code objects, like `(`. My Python version is `3.6.7`. – MiniMax Apr 06 '19 at 16:39
  • With the in-code approach, top-level module code will not be disassembled because it has no representation in the `module` object. This means that you cannot see e.g. the content of code guarded by `if __name__ == '__main__':`. In particular, `dis.dis(this)` will give an empty result. – Karl Knechtel Jul 06 '22 at 23:02

A late answer, but I would have been glad to find it when I needed it. If you want to fully disassemble a script with functions without importing it, you have to implement the sub_byte_code function mentioned in the question. This is done by scanning byte_code.co_consts for constants that are instances of types.CodeType.

The following completes the script from the question:

import dis
import types

source_py = "test.py"

with open(source_py) as f_source:
    source_code = f_source.read()

byte_code = compile(source_code, source_py, "exec")
dis.dis(byte_code)

for x in byte_code.co_consts:
    if isinstance(x, types.CodeType):
        sub_byte_code = x
        func_name = sub_byte_code.co_name
        print('\nDisassembly of %s:' % func_name)
        dis.dis(sub_byte_code)

The result will look something like this:

  1           0 LOAD_CONST               0 (<code object foo at 0x02CB99C0, file "test.py", line 1>)
              2 LOAD_CONST               1 ('foo')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (foo)

  4           8 LOAD_NAME                0 (foo)
             10 LOAD_CONST               2 (42)
             12 CALL_FUNCTION            1
             14 STORE_NAME               1 (x)
             16 LOAD_CONST               3 (None)
             18 RETURN_VALUE

Disassembly of foo:
  2           0 LOAD_FAST                0 (n)
              2 UNARY_NEGATIVE
              4 RETURN_VALUE
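
The loop above only looks at the constants of the module's own code object, so a method inside a class or a function nested inside another function is not reached. A minimal sketch of a recursive variant follows; the helper names iter_code_objects and find_code are hypothetical (they are not part of dis), and the inline source string merely stands in for the contents of test.py. Nothing from the source is executed.

```python
import dis
import types


def iter_code_objects(code):
    """Yield `code` and every code object nested in its constants, depth first."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def find_code(byte_code, name):
    """Return the first code object whose co_name equals `name`, or None.

    Hypothetical helper: if several definitions share a name, only the
    first one found is returned.
    """
    for code in iter_code_objects(byte_code):
        if code.co_name == name:
            return code
    return None


# Inline sample standing in for test.py; it is compiled but never run.
source_code = (
    "class Greeter:\n"
    "    def greet(self, n):\n"
    "        return -n\n"
    "\n"
    "def foo(n):\n"
    "    return -n\n"
)
byte_code = compile(source_code, "test.py", "exec")

# Names of every code object reachable from the module.
names = [code.co_name for code in iter_code_objects(byte_code)]

# Disassemble a method that sits one level below the class body.
greet_code = find_code(byte_code, "greet")
if greet_code is not None:
    dis.dis(greet_code)
```

Matching on co_name is the simplest choice, but it cannot tell two functions with the same name apart; a stricter lookup would need to track the path of enclosing names.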

Edit: starting with Python 3.7, dis.dis disassembles nested function definitions recursively. It also accepts an additional depth argument that limits how many levels of nested definitions are disassembled.
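
To illustrate the depth argument on Python 3.7 or later, here is a small sketch. The disassemble wrapper is a hypothetical helper that captures the listing through the file parameter of dis.dis, and the inline source string is an assumption standing in for test.py.

```python
import dis
import io

# Inline sample with one function nested inside another.
source_code = (
    "def outer():\n"
    "    def inner(n):\n"
    "        return -n\n"
    "    return inner\n"
)
byte_code = compile(source_code, "test.py", "exec")


def disassemble(code, depth=None):
    """Return the dis.dis() listing for `code` as a string (hypothetical helper)."""
    buffer = io.StringIO()
    dis.dis(code, file=buffer, depth=depth)
    return buffer.getvalue()


top_level_only = disassemble(byte_code, depth=0)  # module code only
one_level = disassemble(byte_code, depth=1)       # module code plus outer
full = disassemble(byte_code)                     # depth=None: no limit

print(full)
```

With depth=0 the listing stops at the module level, which matches the pre-3.7 behaviour described in the question.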

Gilles Arcas
  • Actually, this is the only valid answer! Importing would lose all the global module code, which disappears when you use the imported module. – Christian Tismer Jul 11 '20 at 12:26