5

I'd like to embed pylint in a program. The user enters python programs (in Qt, in a QTextEdit, although not relevant) and in the background I call pylint to check the text he enters. Finally, I print the errors in a message box.

There are thus two questions: First, how can I do this without writing the entered text to a temporary file and giving it to pylint ? I suppose at some point pylint (or astroid) handles a stream and not a file anymore.

And, more importantly, is it a good idea ? Would it cause problems for imports or other stuffs ? Intuitively I would say no since it seems to spawn a new process (with epylint) but I'm no python expert so I'm really not sure. And if I use this to launch pylint, is it okay too ?

Edit: I tried tinkering with pylint's internals, event fought with it, but finally have been stuck at some point.

Here is the code so far:

from astroid.builder import AstroidBuilder
from astroid.exceptions import AstroidBuildingException
from logilab.common.interface import implements
from pylint.interfaces import IRawChecker, ITokenChecker, IAstroidChecker
from pylint.lint import PyLinter
from pylint.reporters.text import TextReporter
from pylint.utils import PyLintASTWalker

class Validator():
    def __init__(self):
        self._messagesBuffer = InMemoryMessagesBuffer()
        self._validator = None
        self.initValidator()

    def initValidator(self):
        self._validator = StringPyLinter(reporter=TextReporter(output=self._messagesBuffer))
        self._validator.load_default_plugins()
        self._validator.disable('W0704')
        self._validator.disable('I0020')
        self._validator.disable('I0021')
        self._validator.prepare_import_path([])

    def destroyValidator(self):
        self._validator.cleanup_import_path()

    def check(self, string):
        return self._validator.check(string)


class InMemoryMessagesBuffer():
    def __init__(self):
        self.content = []
    def write(self, st):
        self.content.append(st)
    def messages(self):
        return self.content
    def reset(self):
        self.content = []

class StringPyLinter(PyLinter):
    """Does what PyLinter does but sets checkers once
    and redefines get_astroid to call build_string"""
    def __init__(self, options=(), reporter=None, option_groups=(), pylintrc=None):
        super(StringPyLinter, self).__init__(options, reporter, option_groups, pylintrc)
        self._walker = None
        self._used_checkers = None
        self._tokencheckers = None
        self._rawcheckers = None
        self.initCheckers()

    def __del__(self):
        self.destroyCheckers()

    def initCheckers(self):
        self._walker = PyLintASTWalker(self)
        self._used_checkers = self.prepare_checkers()
        self._tokencheckers = [c for c in self._used_checkers if implements(c, ITokenChecker)
                               and c is not self]
        self._rawcheckers = [c for c in self._used_checkers if implements(c, IRawChecker)]
        # notify global begin
        for checker in self._used_checkers:
            checker.open()
            if implements(checker, IAstroidChecker):
                self._walker.add_checker(checker)

    def destroyCheckers(self):
        self._used_checkers.reverse()
        for checker in self._used_checkers:
            checker.close()

    def check(self, string):
        modname = "in_memory"
        self.set_current_module(modname)

        astroid = self.get_astroid(string, modname)
        self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)

        self._add_suppression_messages()
        self.set_current_module('')
        self.stats['statement'] = self._walker.nbstatements

    def get_astroid(self, string, modname):
        """return an astroid representation for a module"""
        try:
            return AstroidBuilder().string_build(string, modname)
        except SyntaxError as ex:
            self.add_message('E0001', line=ex.lineno, args=ex.msg)
        except AstroidBuildingException as ex:
            self.add_message('F0010', args=ex)
        except Exception as ex:
            import traceback
            traceback.print_exc()
            self.add_message('F0002', args=(ex.__class__, ex))


if __name__ == '__main__':
    code = """
    a = 1
    print(a)
    """

    validator = Validator()
    print(validator.check(code))

The traceback is the following:

Traceback (most recent call last):
  File "validator.py", line 16, in <module>
    main()
  File "validator.py", line 13, in main
    print(validator.check(code))
  File "validator.py", line 30, in check
    self._validator.check(string)
  File "validator.py", line 79, in check
    self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
  File "c:\Python33\lib\site-packages\pylint\lint.py", line 659, in check_astroid_module
    tokens = tokenize_module(astroid)
  File "c:\Python33\lib\site-packages\pylint\utils.py", line 103, in tokenize_module
    print(module.file_stream)
AttributeError: 'NoneType' object has no attribute 'file_stream'
# And sometimes this is added :
  File "c:\Python33\lib\site-packages\astroid\scoped_nodes.py", line 251, in file_stream
    return open(self.file, 'rb')
OSError: [Errno 22] Invalid argument: '<?>'

I'll continue digging tomorrow. :)

Community
  • 1
  • 1
ibizaman
  • 3,053
  • 1
  • 23
  • 34

2 Answers2

2

I got it running.

the first one (NoneType …) is really easy and a bug in your code:

Encountering an exception can make get_astroid “fail”, i.e. send one syntax error message and return!

But for the secong one… such bullshit in pylint’s/logilab’s API… Let me explain: Your astroid object here is of type astroid.scoped_nodes.Module.

It’s also created by a factory, AstroidBuilder, which sets astroid.file = '<?>'.

Unfortunately, the Module class has following property:

@property
def file_stream(self):
    if self.file is not None:
        return open(self.file, 'rb')
    return None

And there’s no way to skip that except for subclassing (Which would render us unable to use the magic in AstroidBuilder), so… monkey patching!

We replace the ill-defined property with one that checks an instance for a reference to our code bytes (e.g. astroid._file_bytes) before engaging in above default behavior.

def _monkeypatch_module(module_class):
    if module_class.file_stream.fget.__name__ == 'file_stream_patched':
        return  # only patch if patch isn’t already applied

    old_file_stream_fget = module_class.file_stream.fget
    def file_stream_patched(self):
        if hasattr(self, '_file_bytes'):
            return BytesIO(self._file_bytes)
        return old_file_stream_fget(self)

    module_class.file_stream = property(file_stream_patched)

That monkeypatching can be called just before calling check_astroid_module. But one more thing has to be done. See, there’s more implicit behavior: Some checkers expect and use astroid’s file_encoding field. So we now have this code in the middle of check:

astroid = self.get_astroid(string, modname)
if astroid is not None:
    _monkeypatch_module(astroid.__class__)
    astroid._file_bytes = string.encode('utf-8')
    astroid.file_encoding = 'utf-8'

    self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)

One could say that no amount of linting creates actually good code. Unfortunately pylint unites enormous complexity with a specialization of calling it on files. Really good code has a nice native API and wraps that with a CLI interface. Don’t ask me why file_stream exists if internally, Module gets built from but forgets the source code.

PS: i had to change sth else in your code: load_default_plugins has to come before some other stuff (maybe prepare_checkers, maybe sth. else)

PPS: i suggest subclassing BaseReporter and using that instead of your InMemoryMessagesBuffer

PPPS: this just got pulled (3.2014), and will fix this: https://bitbucket.org/logilab/astroid/pull-request/15/astroidbuilderstring_build-was/diff

4PS: this is now in the official version, so no monkey patching required: astroid.scoped_nodes.Module now has a file_bytes property (without leading underscore).

flying sheep
  • 8,475
  • 5
  • 56
  • 73
  • This is awesome. I didn't test what you suggested though. Did you manage to use it without the pull request or is it mandatory? I would be grateful if you can post a gist of what you needed or suggest as changes. (To be honest this request is 50% laziness, 50% confusion) :) – ibizaman Nov 25 '13 at 15:13
  • 1
    the monkeypatch in concert with explicit setting of the `file_encoding` and `_file_bytes` fields replaces the pull request, yeah. and it won’t break the code once the pull request is pulled. so just use it now, and after the request is pulled and a new astroid version is out, either remove the three lines after `if astroid is not None` or forget to do it :) and as said: the addition of `if astroid is not None` fixes a bug in *your* code. – flying sheep Nov 25 '13 at 16:26
1

Working with an unlocatable stream may definitly cause problems in case of relative imports, since the location is then needed to find the actually imported module.

Astroid support building an AST from a stream, but this is not used/exposed through Pylint which is a level higher and designed to work with files. So while you may acheive this it will need a bit of digging into the low-level APIs.

The easiest way is definitly to save the buffer to the file then to use the SA answer to start pylint programmatically if you wish (totally forgot this other account of mine found in other responses ;). Another option being to write a custom reporter to gain more control.

sthenault
  • 14,397
  • 5
  • 38
  • 32
  • So if I stay away from relative imports everything will be fine ? If yes, it seems quite easy to dive into astroid's low-level API. At least I'll try that first before resorting to temporary files. – ibizaman Oct 24 '13 at 12:15
  • @ibizaman: any progress? i’d like to do the exact same and abhor termpfiles. – flying sheep Nov 23 '13 at 19:54