
The following grammar illustrates the issue:

```
// test Antlr4 left recursion associativity
grammar LRA;
@parser::members {
    public static void main(String[] ignored) throws Exception{
        final LRALexer lexer = new LRALexer(new ANTLRInputStream(System.in));
        final LRAParser parser = new LRAParser(new CommonTokenStream(lexer));
        parser.setTrace(true);
        parser.file();
    }
}
ID: [A-Za-z_] ([A-Za-z_]|[0-9])*;
CMA: ',';
SMC: ';';
UNK: . -> skip;
file: punctuated EOF;
punctuated
    : punctuated cma punctuated
    | punctuated smc punctuated
    | expression
    ;
cma: CMA;
smc: SMC;
expression: id;
id: ID;
```

Given the input "a,b,c", I get the following listener event trace:

( 'a' ) ( ',' ( 'b' ) ( ',' ( 'c' ) ) )

where `(` represents entering `punctuated`, `)` represents exiting `punctuated`, and all other rules are omitted for brevity and clarity.

By inspection, this order of listener events represents a right-associative parse, i.e. a,(b,c).

Common practice, and The Definitive ANTLR 4 Reference, lead me to expect a left-associative parse, i.e. (a,b),c, corresponding to the following listener event trace:

( 'a' ) ( ',' ( 'b' ) ) ( ',' ( 'c' ) )

Is there something wrong with my grammar, my expectations, my interpretation of the listener events, or something else?

  • I haven't had this particular problem, but there's an [`assoc`](http://www.antlr.org/wiki/display/ANTLR4/Options#Options-RuleElementOptions) token parameter you could experiment with, and you could try moving `expression` to be the first option in `punctuated` and see whether that makes any difference. – Brad Mace Aug 07 '13 at 21:44
  • I have discovered that what works is hoisting the tokens directly into the left-recursive rule: `punctuated: punctuated CMA punctuated | punctuated SMC punctuated | expression;` – user2651435 Aug 08 '13 at 03:57

1 Answer


I would consider the workaround described in the comment above (hoisting the tokens directly into the left-recursive rule) to be an adequate answer. The generated parser needs to pass a precedence parameter to each recursive call. Since that precedence is associated with a token, the token has to appear directly in the recursive rule so that ANTLR can find its precedence.

The working grammar looks like this:

```
// test Antlr4 left recursion associativity
grammar LRA;
@parser::members {
    public static void main(String[] ignored) throws Exception{
        final LRALexer lexer = new LRALexer(new ANTLRInputStream(System.in));
        final LRAParser parser = new LRAParser(new CommonTokenStream(lexer));
        parser.setTrace(true);
        parser.file();
    }
}
ID: [A-Za-z_] ([A-Za-z_]|[0-9])*;
CMA: ',';
SMC: ';';
UNK: . -> skip;
file: punctuated EOF;
punctuated
    : punctuated CMA punctuated
    | punctuated SMC punctuated
    | expression
    ;
expression: id;
id: ID;
```
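
To illustrate why the precedence passed to the recursive call decides associativity, here is a minimal precedence-climbing sketch in plain Java. It is not ANTLR's generated code, only a hand-written analogue under stated assumptions: the names `AssocSketch`, `parseAll`, and `precedenceOf` are hypothetical, and the precedence values (`,` binds tighter than `;`, mirroring the order of the alternatives in `punctuated`) are chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written precedence-climbing parser; NOT what ANTLR generates.
final class AssocSketch {
    private final List<String> tokens;
    private final boolean leftAssoc;
    private int pos = 0;

    private AssocSketch(List<String> tokens, boolean leftAssoc) {
        this.tokens = tokens;
        this.leftAssoc = leftAssoc;
    }

    // Assumed precedences: earlier alternative (',') binds tighter than ';'.
    // Anything that is not an operator gets -1 so the loop stops on it.
    private static int precedenceOf(String token) {
        if (token.equals(",")) return 2;
        if (token.equals(";")) return 1;
        return -1;
    }

    // Parses operand (op operand)* while the operator's precedence >= minPrec.
    private String parse(int minPrec) {
        String left = tokens.get(pos++); // a single identifier operand
        while (pos < tokens.size()) {
            String op = tokens.get(pos);
            int prec = precedenceOf(op);
            if (prec < minPrec) break;
            pos++; // consume the operator
            // The precedence handed to the recursive call decides associativity:
            // prec + 1 makes the right operand stop at an equal-precedence
            // operator (left-associative); prec lets it absorb that operator
            // (right-associative).
            String right = parse(leftAssoc ? prec + 1 : prec);
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    // Returns a fully parenthesized rendering of the parse.
    static String parseAll(List<String> tokens, boolean leftAssoc) {
        return new AssocSketch(tokens, leftAssoc).parse(0);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", ",", "b", ",", "c");
        System.out.println(parseAll(input, true));  // prints ((a , b) , c)
        System.out.println(parseAll(input, false)); // prints (a , (b , c))
    }
}
```

In this sketch the parser can only choose the right precedence because it sees the operator token itself before recursing, which is the same reason given above for keeping `CMA` and `SMC` directly in the recursive rule.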