ANTLR4 JavaScript parser: how to catch an error in parsing

Question

I have a grammar in ANTLR4 around which I am writing an application. A snippet of the pertinent grammar is shown below:

grammar SomeGrammar;
// ... a bunch of other parse rules
operand
   : id | literal ;
id
   : ID ;
literal
   : LITERAL ;
// A bunch of other lexer rules
LITERAL       : NUMBER | BOOLEAN | STRING;
NUMBER        : INTEGER | FLOAT ;
INTEGER       : [0-9]+ ;
FLOAT         : INTEGER '.' INTEGER | '.' INTEGER ;
BOOLEAN       : 'TRUE' | 'FALSE' ;
ID            : [A-Za-z]+[A-Za-z0-9_]* ;
STRING        : '"' .*? '"' ;

I generate the antlr4 JavaScript Lexer and Parser like so:

$ antlr4 -o . -Dlanguage=JavaScript -listener -visitor

and then I override the exitLiteral() prototype method to check whether an operand is a literal. The issue is that if I pass an identifier such as a, it (force) parses it as a literal and reports an error (shown below with grun):

$ grun YARL literal -gui -tree
a
line 1:0 mismatched input 'a' expecting LITERAL
(literal a)

I get the same error when I use the JavaScript parser, with the listener method overridden like so:

SomeGrammarLiteralPrinter.prototype.exitLiteral = function (ctx) {
    debug ("Literal is " + ctx.getText ()); // Literal is a
};

I would like to catch the error so that I can decide that it is an ID, and not a LITERAL. How do I do that?

Any help is appreciated.

@sepp2k: I specifically *need* to know if it is a literal or operand, but I get your point. I think I could check with the same lexing rules in my application and use the `operand` like you said, but I was wondering if there is an antlr4 parser way. — Sonny, May 05 '18 at 12:32
I don't have much experience with ANTLR4, but you'll know that based on which listener/visitor method will be called, no? — sepp2k, May 05 '18 at 12:34
@sepp2k, I think you are right about that too. I need to go back to the drawing board to understand how I am using the listeners. Thanks again! — Sonny, May 05 '18 at 12:45

GRosenberg · Accepted Answer · 2018-05-05T18:05:10.707

A better solution is to adjust the grammar so that it accurately describes the intended syntax to begin with:

startRule : ruleA ruleB EOF ;
ruleA     : something operand anotherthing ;
ruleB     : id assign literal  ;

operand   : ID | LITERAL ;
id        : ID ;
literal   : LITERAL ;

The parser performs a top-down graph evaluation of the parser rules, starting with the startRule. That is, the parser evaluates the listed startRule elements in order, sequentially descending through the named sub-rules (and just those sub-rules). Consequently, while matching ruleA the parser never considers the id and literal rules, because ruleA refers only to operand.

In this limited example then, there is no conflict in the seemingly overlapping definition of the operand, id, and literal rules.

Update

The OperandContext class will contain ID() and LITERAL() methods returning TerminalNode. The one that does not return null represents the symbol that was actually matched in that specific context. Look at the generated code.
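
As a rough illustration of that null check, here is a minimal sketch in JavaScript. It assumes the rule operand : ID | LITERAL ; from the grammar above, so that the generated context exposes ID() and LITERAL() accessors. The names classifyOperand and fakeOperandContext are hypothetical helpers invented for this example; fakeOperandContext only stands in for a generated OperandContext so the snippet runs without the ANTLR runtime.

```javascript
// Sketch only: decide which alternative of `operand : ID | LITERAL ;` matched
// by checking which terminal accessor returns a node rather than null.
// classifyOperand is a hypothetical helper, not part of ANTLR.
function classifyOperand(ctx) {
    const id = ctx.ID();
    if (id) {
        return { kind: "ID", text: id.getText() };
    }
    const literal = ctx.LITERAL();
    if (literal) {
        return { kind: "LITERAL", text: literal.getText() };
    }
    // Neither accessor returned a node, e.g. after a syntax error.
    return { kind: "UNKNOWN", text: null };
}

// Hypothetical stand-in for a generated OperandContext (assumption: the real
// accessors return a terminal node with getText(), or null when absent).
function fakeOperandContext(kind, text) {
    const node = { getText: () => text };
    return {
        ID: () => (kind === "ID" ? node : null),
        LITERAL: () => (kind === "LITERAL" ? node : null),
    };
}

console.log(classifyOperand(fakeOperandContext("ID", "a")));
console.log(classifyOperand(fakeOperandContext("LITERAL", "42")));
```

In a real listener the same helper could presumably be called from an exitOperand method with the ctx that the tree walker passes in; check the generated parser file to confirm the accessor names for your grammar.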

Thanks for saving my hide again, GRosenberg. Like I mentioned in the comment on the OP to sepp2k, I think I need to go back to the drawing board wrt how I use the listeners and parser. It would help my cause tremendously if the main rule's parse tree also gave me the specific type of token it matched as I walked it, but I cannot seem to get that to work. — Sonny, May 05 '18 at 12:47
Appreciate your update on this, GRosenberg. Exactly what I was looking for. Thank you again! — Sonny, May 09 '18 at 17:30
