I am trying to understand how ANTLR4 handles errors in a Python environment. My final code needs to detect and report any invalid data in the file, regardless of where it appears. As part of this effort I am using the examples in the py3antlr4book to try some basic scenarios. Specifically, I used the example in the 01-Hello directory and tried two different input files with bogus entries added:
Hello.g4
grammar Hello; // Define a grammar called Hello
r : 'hello' ID ; // match keyword hello followed by an identifier
ID : [a-z]+ ; // match lower-case identifiers
WS : [ \t\r\n]+ -> skip ; // skip spaces, tabs, newlines, \r (Windows)
bogus_first.txt
bogus
hello world
Output
line 1:0 extraneous input 'bogus' expecting 'hello'
(r bogus hello world)
bogus_last.txt
hello world
bogus
Output
(r hello world)
The output from bogus_first.txt makes sense to me: the parser reported an error and indicated where it occurred. The output from bogus_last.txt, however, contains no error and gives no indication that the input contains bad data, which surprised me. I tried this article's suggestion of adding an ErrorListener, but it did not catch the bogus entry. I also tried adding an ErrorStrategy, but that did not catch it either.
Below is the code I used to implement the ErrorListener and ErrorStrategy. The inErrorRecoveryMode hook did not seem to report the position I wanted, but I am not sure whether I am printing the right data.
What do I need to change in my testbench so that it reports an error for input like bogus_last.txt?
test_hello.py
import sys
from antlr4 import *
from HelloLexer import HelloLexer
from HelloParser import HelloParser
from antlr4.error.ErrorListener import ErrorListener
from antlr4.error.ErrorStrategy import DefaultErrorStrategy

class MyErrorListener( ErrorListener ):
    def __init__(self):
        super().__init__()

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        raise Exception("Oh no!!")

    def reportAmbiguity(self, recognizer, dfa, startIndex, stopIndex, exact, ambigAlts, configs):
        raise Exception("Oh no!!")

    def reportAttemptingFullContext(self, recognizer, dfa, startIndex, stopIndex, conflictingAlts, configs):
        raise Exception("Oh no!!")

    def reportContextSensitivity(self, recognizer, dfa, startIndex, stopIndex, prediction, configs):
        raise Exception("Oh no!!")

class MyErrorStrategy(DefaultErrorStrategy):
    def __init__(self):
        super().__init__()

    def reset(self, parser):
        raise Exception("Oh no!!")

    def recoverInline(self, parser):
        raise Exception("Oh no!!")

    def recover(self, parser, excp):
        raise Exception("Oh no!!")

    def sync(self, parser):
        raise Exception("Oh no!!")

    def inErrorRecoveryMode(self, parser):
        ctx = parser._ctx
        print(self.lastErrorIndex)
        return super().inErrorRecoveryMode(parser)

    def reportError(self, parser, excp):
        raise Exception("Oh no!!")

def main(argv):
    input = FileStream(argv[1])
    lexer = HelloLexer(input)
    stream = CommonTokenStream(lexer)
    parser = HelloParser(stream)
    parser.addErrorListener( MyErrorListener() )
    parser._errHandler = MyErrorStrategy()
    tree = parser.r()
    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    main(sys.argv)
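To make the bogus_last.txt behavior concrete, here is a toy sketch of what I suspect is happening: rule r is satisfied after 'hello' and one ID, so nothing forces the parser to look at whatever follows. This is plain Python, not ANTLR, and it does not model ANTLR's error recovery; tokenize, parse_r and the require_eof flag are hypothetical names I made up for illustration. The flag is meant to play the role that an explicit EOF at the end of the rule (something like r : 'hello' ID EOF ;) would play in a grammar, which is an assumption on my part rather than something I have confirmed against my testbench.

```python
import re

def tokenize(text):
    """Return runs of lower-case letters as tokens (roughly ID, with WS skipped).

    Simplification: characters outside [a-z] are silently ignored here.
    """
    return re.findall(r"[a-z]+", text)

def parse_r(tokens, require_eof=False):
    """Toy recognizer for r : 'hello' ID. Returns a list of error messages."""
    errors = []
    pos = 0
    # Match the keyword 'hello'.
    if pos < len(tokens) and tokens[pos] == "hello":
        pos += 1
    else:
        errors.append("expecting 'hello' at token %d" % pos)
        return errors
    # Match one identifier (simplification: any token counts as ID).
    if pos < len(tokens):
        pos += 1
    else:
        errors.append("missing ID at token %d" % pos)
        return errors
    # Without this check the rule is done and trailing tokens are never examined.
    if require_eof and pos < len(tokens):
        errors.append("extraneous input %r expecting <EOF>" % tokens[pos])
    return errors

if __name__ == "__main__":
    bogus_last = "hello world\nbogus\n"
    print(parse_r(tokenize(bogus_last)))                    # trailing token goes unnoticed
    print(parse_r(tokenize(bogus_last), require_eof=True))  # trailing token is reported
```

In this sketch the trailing bogus token only produces an error when the recognizer is told that the input must end right after the rule, which matches what I see with bogus_last.txt.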