Let's consider this simple ANTLR4 language grammar.
Lexer:
lexer grammar BiaLexer;
Lt : '<' ;
Gt : '>' ;
Identifier : [a-zA-Z] ([a-zA-Z1-9] | ':')* ;
LeftParen : '(' ;
RightParen : ')' ;
Comma : ',' ;
Whitespace : (' ' | '\n') -> skip ;
Parser:
parser grammar BiaParser;
options { tokenVocab = BiaLexer; }
typeExpression
: referredName=Identifier # typeReference ;
expression
: callee=expression callTypeVariableList LeftParen callArgumentList RightParen # callExpression
| left=expression operator=Lt right=expression # binaryOperation
| left=expression operator=Gt right=expression # binaryOperation
| referredName=Identifier # reference
| LeftParen expression RightParen # parenExpression ;
callTypeVariableList: Lt typeExpression (Comma typeExpression)* Gt ;
callArgumentList: (expression (Comma expression)*)? ;
So, basically, this language has only:
ordinary references, e.g.
a
type references, e.g.
A
comparisons, e.g.
a < b
or
c > d
expressions wrapped in parentheses, e.g.
(a)
and, finally, generic function calls, e.g.
f<A, B>(a, b)
or
f<A>(a)
(similar to, let's say, Kotlin)
This grammar is ambiguous. A simple expression like f<A>(a) can be interpreted as...
...a generic call: Call(callee = ref:f, typeArgs = TypeArgs(typeRef:A), args = Args(ref:a))
...or a chain of comparisons between a reference, another reference and a parenthesised expression: Binary(op = >, left = Binary(op = <, left = ref:f, right = ref:A), right = Paren(ref:a))
The actual parser generated by ANTLR picks the second interpretation, i.e. the comparison chain. If I comment out the binary operation rules...
// | left=expression operator=Lt right=expression # binaryOperation
// | left=expression operator=Gt right=expression # binaryOperation
...then the result is, as I expected, the generic call.
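To make the two readings of f<A>(a) concrete, here is a minimal hand-written sketch in Python. It does not use ANTLR and says nothing about how the generated parser decides between alternatives. The helpers tokenize, parse_call and parse_comparison are hypothetical names made up for this illustration, and parse_call is simplified to a single call with a plain list of type names. It only shows that the same token sequence is fully consumed under either reading.

```python
import re

# Mirrors BiaLexer: identifiers (character class copied as-is from the
# question) and the five punctuation tokens; spaces and newlines are skipped.
TOKEN_RE = re.compile(r"[ \n]*([a-zA-Z][a-zA-Z1-9:]*|[<>(),])")


def tokenize(text):
    """Split text into token strings, ending with an "<EOF>" sentinel."""
    text = text.strip(" \n")
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input near offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens + ["<EOF>"]


def expect(tokens, i, literal):
    """Check that tokens[i] is the given literal and step past it."""
    if tokens[i] != literal:
        raise ValueError(f"expected {literal!r}, got {tokens[i]!r}")
    return i + 1


def parse_primary(tokens, i):
    """primary := Identifier | '(' comparison ')'"""
    if tokens[i] == "(":
        inner, i = parse_comparison(tokens, i + 1)
        return ("Paren", inner), expect(tokens, i, ")")
    if not tokens[i][0].isalpha():
        raise ValueError(f"expected identifier or '(', got {tokens[i]!r}")
    return ("Ref", tokens[i]), i + 1


def parse_comparison(tokens, i=0):
    """Reading 2: a left-associative chain of '<' and '>' comparisons."""
    left, i = parse_primary(tokens, i)
    while tokens[i] in ("<", ">"):
        op = tokens[i]
        right, i = parse_primary(tokens, i + 1)
        left = ("Binary", op, left, right)
    return left, i


def parse_call(tokens, i=0):
    """Reading 1: callee '<' type (',' type)* '>' '(' arguments? ')'"""
    callee, i = parse_primary(tokens, i)
    i = expect(tokens, i, "<")
    type_args = []
    while True:
        if not tokens[i][0].isalpha():
            raise ValueError(f"expected type name, got {tokens[i]!r}")
        type_args.append(("TypeRef", tokens[i]))
        i += 1
        if tokens[i] != ",":
            break
        i += 1
    i = expect(tokens, i, ">")
    i = expect(tokens, i, "(")
    args = []
    if tokens[i] != ")":
        while True:
            arg, i = parse_comparison(tokens, i)
            args.append(arg)
            if tokens[i] != ",":
                break
            i += 1
    i = expect(tokens, i, ")")
    return ("Call", callee, type_args, args), i


tokens = tokenize("f<A>(a)")
print(parse_call(tokens)[0])        # the generic-call reading
print(parse_comparison(tokens)[0])  # the comparison-chain reading
```

Both functions stop at the "<EOF>" sentinel for this input, which is exactly what makes the expression ambiguous: nothing in the token stream alone rules out either tree.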
Please note that I've deliberately put the #callExpression case at the top of the expression rule, with the intention of declaring that it has higher precedence than the comparison cases below it. I believed that this is how one declares case precedence in ANTLR, but evidently it doesn't work in this case.
Questions:
- why does ANTLR interpret f<A>(a) as a chain of comparisons?
- how can I fix that, i.e. make the generic call take precedence over the comparison chain?
If it matters, I can provide the code I used to dump the AST as a pretty-printed string, but it's just a simple ANTLR visitor that emits a string. I've left it out for readability.