I'm looking for a way to use importlib in Python 2.x to rewrite the bytecode of imported modules on the fly. In other words, I need to hook my own function in between the compilation and execution steps of an import. Apart from that, I want importing to work just like the built-in mechanism.

I've already done that with imputil, but that library doesn't cover all cases and is deprecated anyway.

Michał Kwiatkowski
  • "I need to hook my own function between the compilation and execution step" Isn't this what decorators are for? Are you monkey patching? What is the point of this? – S.Lott Sep 22 '10 at 14:08
  • I am sure that, given that Michał has already implemented it in imputil, he has a point for it. – Muhammad Alkarouri Sep 22 '10 at 15:26
  • I'm an author of Pythoscope, a unit test generator. In order to intercept events of a running application I need to do some bytecode manipulation first. You can see my implementation of this using imputil here: http://bazaar.launchpad.net/~pythoscope-developers/pythoscope/trunk/annotate/head:/bytecode_tracer/code_rewriting_importer.py That fails in some cases (e.g. it doesn't work when pickling is involved), so I need a more bulletproof solution. importlib AFAIK is regarded as the new standard for import hooks, but I don't know much about it, thus my question. – Michał Kwiatkowski Sep 22 '10 at 16:43

1 Answer

Having had a look through the importlib source code, I believe you could subclass PyLoader in the _bootstrap module and override get_code:

class PyLoader:
    ...

    def get_code(self, fullname):
        """Get a code object from source."""
        source_path = self.source_path(fullname)
        if source_path is None:
            message = "a source path must exist to load {0}".format(fullname)
            raise ImportError(message)
        source = self.get_data(source_path)
        # Convert to universal newlines.
        line_endings = b'\n'
        for index, c in enumerate(source):
            if c == ord(b'\n'):
                break
            elif c == ord(b'\r'):
                line_endings = b'\r'
                try:
                    if source[index+1] == ord(b'\n'):
                        line_endings += b'\n'
                except IndexError:
                    pass
                break
        if line_endings != b'\n':
            source = source.replace(line_endings, b'\n')

        # Modified here: compile the source, then pass the code object
        # through your own rewrite_code() before it is executed.
        code = compile(source, source_path, 'exec', dont_inherit=True)
        return rewrite_code(code)
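
Independently of which loader class ends up hosting it, the step being hooked can be sketched in isolation: compile the source, hand the code object to the rewriter, then execute the result in a fresh module namespace. This is only a rough illustration, not importlib's API. The names load_rewritten and the pass-through rewrite_code below are hypothetical stand-ins, and a real import hook would call something like this from its loader rather than directly.

```python
import sys
import types


def rewrite_code(code):
    # Hypothetical placeholder for the bytecode transformation. A real
    # rewriter would build and return a modified code object.
    return code


def load_rewritten(fullname, source, filename="<rewritten>"):
    """Compile source, run the code object through rewrite_code, then execute it."""
    # flags=0, dont_inherit=True, passed positionally.
    code = compile(source, filename, "exec", 0, True)
    code = rewrite_code(code)  # the hook between compilation and execution
    module = types.ModuleType(fullname)
    module.__file__ = filename
    # Register the module before executing it, as the regular import
    # machinery does, so circular imports can find it.
    sys.modules[fullname] = module
    try:
        exec(code, module.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind.
        del sys.modules[fullname]
        raise
    return module
```

Calling load_rewritten("demo", "x = 1\n") returns a module object with x set and leaves it registered in sys.modules under the given name.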

I assume you know what you're doing, but on behalf of programmers everywhere I believe I should say: ugh =p

Katriel
  • Yes, I know what I'm doing, thanks. ;) Anyway, only now I've noticed that the importlib backport only backports a single function, not the whole library. Ehh, back to square one... – Michał Kwiatkowski Sep 23 '10 at 12:29