
When there are no syntax errors:

>>> import symtable
>>> symtable.symtable("guido = 1; rossum = 2", "code", "exec").get_symbols()
[<symbol 'rossum'>, <symbol 'guido'>]

When there are syntax errors:

>>> symtable.symtable("guido = 1; for", "code", "exec").get_symbols()
guido = 1; for
             ^
SyntaxError: invalid syntax

Is there a way to make this fault-tolerant? In other words, in the second example, is it possible to generate a symtable that has the "guido" symbol in it even though there is a syntax error elsewhere in the code?

I've looked for modules that turn Python code with syntax errors into code that compiles (e.g., by removing the bad statements), but I wasn't able to find anything.

Billy
  • That's an impossible task; syntax exists to make statements unambiguous. Syntax errors make it impossible to know exactly what's a symbol and what is something else. – Martijn Pieters Mar 15 '14 at 17:16
  • Fair point. It still seems like it should be possible to use the code _before_ the syntax error, right? – Billy Mar 15 '14 at 17:18
  • Not really. What if the syntax error is a missing closing parenthesis somewhere? You have no idea where the working code ends and the broken starts at that point. – Martijn Pieters Mar 15 '14 at 17:19

1 Answer


In a typical system, the parser runs before any semantics of the language is evaluated (e.g. via the AST). As such, syntax errors are detected long before symbols are recognized as symbols. So it’s simply not possible to have a symbol table because that’s only created after the whole code has been parsed.
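
A quick way to see this, as a minimal sketch: `symtable.symtable` raises before it returns anything, and the exception carries only a position (`lineno`, `offset`), not a partial table. The helper name `try_symbols` is made up for illustration.

```python
import symtable

def try_symbols(code):
    """Return (sorted symbol names, None) on success, (None, SyntaxError) on failure."""
    try:
        table = symtable.symtable(code, "code", "exec")
    except SyntaxError as err:
        # Parsing failed, so no symbol table was ever built.
        return None, err
    return sorted(s.get_name() for s in table.get_symbols()), None

names, err = try_symbols("guido = 1; rossum = 2")
print(names)  # ['guido', 'rossum']

names, err = try_symbols("guido = 1; for")
print(names, err.lineno)  # None 1
```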

If you want to “repair” the code somehow, you could still catch the SyntaxError and then change the code in some way until it parses. For the above example, something like this could work, but I’m pretty sure it won’t work for more complicated stuff, so you would have to expand it.

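import symtable

def getSafeSymbols(code):
    while True:
        try:
            tbl = symtable.symtable(code, "code", "exec")
        except SyntaxError as e:
            # e.offset is a 1-based column within line e.lineno (either may be
            # None), so convert it to an index into the whole string first
            lines = code.splitlines(keepends=True)
            start = sum(len(ln) for ln in lines[:(e.lineno or 1) - 1])
            index = min(start + (e.offset or len(code)), len(code)) - 1

            # keep decrementing index until whitespace or statement separator
            while index >= 0 and code[index] not in ' \t\n;':
                index -= 1

            # the slice is always shorter than code, so the loop terminates
            code = code[:max(index, 0)].strip()
        else:
            return tbl.get_symbols()
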
>>> getSafeSymbols("guido = 1; rossum = 2")
[<symbol 'guido'>, <symbol 'rossum'>]
>>> getSafeSymbols("guido = 1; for")
[<symbol 'guido'>]
>>> getSafeSymbols("guido = foo; bar = baz")
[<symbol 'baz'>, <symbol 'foo'>, <symbol 'guido'>, <symbol 'bar'>]
>>> getSafeSymbols("guido = foo; bar = ")
[<symbol 'foo'>, <symbol 'guido'>, <symbol 'bar'>]
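
A different repair strategy, sketched here under the assumption that the code is split over several lines: instead of truncating at the error, drop the line the `SyntaxError` points at and retry, so that statements after the bad line survive. The function name `symbols_skipping_bad_lines` is hypothetical, and it returns sorted names rather than `Symbol` objects to keep the output stable.

```python
import symtable

def symbols_skipping_bad_lines(code):
    """Drop the line reported by each SyntaxError until the rest parses."""
    lines = code.splitlines()
    while True:
        try:
            table = symtable.symtable("\n".join(lines), "code", "exec")
        except SyntaxError as err:
            if not lines:
                raise
            # lineno is 1-based and may be None or point past the last line;
            # fall back to the last line in those cases.
            bad = min(err.lineno or len(lines), len(lines)) - 1
            del lines[bad]  # one line fewer each round, so this terminates
        else:
            return sorted(s.get_name() for s in table.get_symbols())

print(symbols_skipping_bad_lines("guido = 1\nbar = = 2\nrossum = 3"))
# ['guido', 'rossum']
```

The reported line is not always the broken one (an unclosed parenthesis is typically reported further down), so this can discard valid lines as well.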
poke
  • I guess I was hoping that I could parse code, determine that there were syntax errors, ignore blocks of code corresponding to these syntax errors, and then create code that parses correctly. – Billy Mar 15 '14 at 17:30
  • Well, you could still put that into a `try` block, catch the `SyntaxError` and then make modifications to the code string and try again. – poke Mar 15 '14 at 17:32