Your grammar is ambiguous:
"a" AND "b"
can be matched as
parse_starts_here
and_expression
and_thing
and_level2_thing
text
OP_AND
and_level2_thing
text
or as
parse_starts_here
and_expression
and_thing
and_level2_thing
text
OP_AND
and_thing
and_level2_thing
text
Normally, ANTLR would warn you about this ambiguity, but by declaring backtrack=true you effectively tell ANTLR to try all alternatives and use the first one that matches.
On unambiguous grammars ANTLR runs in linear time. Backtracking can make the running time exponential. The memoize=true option is used to bring the time back to linear at the cost of more memory.
I would recommend removing the backtrack=true option. ANTLR will then tell you where the grammar is ambiguous. You can either remove the ambiguity or, if that is not possible, use syntactic predicates only where needed to prefer one possible match over the other. memoize=true would still help if you end up using syntactic predicates.
Edit - As to why there's backtracking even when both alternatives match:
It does not backtrack, but the time will still be exponential.
The problem is that ANTLR doesn't know it can match the first alternative until it actually tries to match it (since you didn't give it any hints). So it will first try to match the rule, and if that succeeds it will match the rule again for real and perform all associated actions (the memoize option avoids exactly this by remembering that the particular rule succeeded at the given input position, so the whole matching process is not repeated).
Example:
"a" AND ( "b" AND "c" )
To match this, ANTLR must:
- Match "a"
- Decide whether the AND can be matched using the inner rule
  - To do this, it tries to match the inner rule
    - AND matches, ( means go to and_thing
    - To match and_thing, it must:
      - Match ( and "b"
      - Decide whether the AND can be matched using the inner rule
        - To do this, it tries to match the inner rule against AND "c"
        - The predicate succeeds - AND "c" matches the inner rule
      - Match the inner rule against AND "c"
      - Match )
  - The predicate succeeds - AND ( "b" AND "c" ) matches the inner rule
- Match the inner rule against AND ( "b" AND "c" )
  - AND matches, ( means go to and_thing
  - To match and_thing, it must:
    - Match ( and "b"
    - Decide whether the AND can be matched using the inner rule
      - To do this, it tries to match the inner rule against AND "c"
      - The predicate succeeds - AND "c" matches the inner rule
    - Match the inner rule against AND "c"
    - Match )
As the steps above show, ANTLR needs to match the text AND "c" four times to match the input, even though there is only one level of nesting. If there were another level, this whole process would repeat twice, so ANTLR would parse the last part eight times.
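The counting above can be reproduced with a small toy model. The following Python sketch is not ANTLR and not the code ANTLR generates; it is a hand-written recursive-descent parser for a simplified grammar (expr := thing (AND thing)*, thing := TEXT | '(' expr ')'), and all names in it (ToyParser, tail, innermost_attempts) are made up for illustration. It assumes that each AND tail is first tried speculatively and then matched again, and that a memo is consulted only during speculation.

```python
from collections import Counter


def tokenize(text):
    """Split the toy input into tokens; parentheses become separate tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


class ToyParser:
    """Toy model: expr := thing (AND thing)*, thing := TEXT | '(' expr ')'."""

    def __init__(self, tokens, memoize=False):
        self.tokens = tokens
        self.memoize = memoize
        self.memo = {}                  # tail start position -> end position or None
        self.tail_attempts = Counter()  # how often `AND thing` was matched per position

    def peek(self, pos):
        return self.tokens[pos] if pos < len(self.tokens) else None

    def thing(self, pos, speculating):
        tok = self.peek(pos)
        if tok == "(":
            end = self.expr(pos + 1, speculating)
            return end + 1 if end is not None and self.peek(end) == ")" else None
        if tok in (None, "AND", ")"):
            return None
        return pos + 1                  # any other token counts as text

    def tail(self, pos, speculating):
        # Assumption of this model: the memo is only consulted while speculating,
        # because a real match still has to run (and execute its actions).
        if self.memoize and speculating and pos in self.memo:
            return self.memo[pos]
        self.tail_attempts[pos] += 1
        end = self.thing(pos + 1, speculating) if self.peek(pos) == "AND" else None
        if self.memoize and speculating:
            self.memo[pos] = end
        return end

    def expr(self, pos, speculating=False):
        pos = self.thing(pos, speculating)
        while pos is not None and self.peek(pos) == "AND":
            if self.tail(pos, True) is None:    # the predicate: try the whole tail
                break
            pos = self.tail(pos, speculating)   # then match it again
        return pos


def innermost_attempts(text, memoize=False):
    """Parse text and report how often the last `AND <text>` tail was matched."""
    tokens = tokenize(text)
    parser = ToyParser(tokens, memoize)
    assert parser.expr(0) == len(tokens), "input should parse completely"
    last_and = len(tokens) - 1 - tokens[::-1].index("AND")
    return parser.tail_attempts[last_and]
```

In this model, innermost_attempts('"a" AND ( "b" AND "c" )') gives 4 and adding one more level of nesting gives 8, while passing memoize=True gives 2 for both inputs. The memoized figure depends on the modelling assumption stated above, so treat it as an illustration of the trend rather than a measurement of ANTLR itself.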
One related remark - if you use syntactic predicates rather than the backtrack option, you can fine-tune what the predicate contains - in some cases it need not contain the whole rule being predicated. In your example above, you could just tell ANTLR to use the OP_AND and_level2_thing rule whenever it encounters OP_AND, without needing to check whether and_level2_thing matches. Note that you can only do this because you know that either and_level2_thing will match, or the other alternative will fail as well. If you do this wrong, you end up with the parser getting lost and refusing an input that would be valid if it chose the right alternative.
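To illustrate the difference between a predicate that contains the whole rule and one that only looks at the next token, here is a toy Python sketch. It is again a hand-written model of a simplified grammar (expr := thing (AND thing)*, thing := TEXT | '(' expr ')'), not ANTLR output, and the parse function and its full_predicate flag are hypothetical names chosen for this example.

```python
def tokenize(text):
    """Split the toy input into tokens; parentheses become separate tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens, full_predicate):
    """Return (end position or None, number of times an `AND thing` tail was matched)."""
    attempts = 0

    def peek(pos):
        return tokens[pos] if pos < len(tokens) else None

    def thing(pos):
        tok = peek(pos)
        if tok == "(":
            end = expr(pos + 1)
            return end + 1 if end is not None and peek(end) == ")" else None
        if tok in (None, "AND", ")"):
            return None
        return pos + 1                  # any other token counts as text

    def tail(pos):
        nonlocal attempts
        attempts += 1
        return thing(pos + 1) if peek(pos) == "AND" else None

    def expr(pos):
        pos = thing(pos)
        while pos is not None and peek(pos) == "AND":
            # Full predicate: speculatively match the whole tail first.
            # Cheap predicate: seeing AND is enough to commit to the tail.
            if full_predicate and tail(pos) is None:
                break
            pos = tail(pos)
        return pos

    return expr(0), attempts
```

For the input "a" AND ( "b" AND "c" ), parse(tokens, True) matches a tail 6 times in total, while parse(tokens, False) matches it only 2 times, once per AND. The cost of the cheap predicate is that the parser commits as soon as it sees AND: for the truncated input "a" AND, the cheap variant returns None as its end position, whereas the full variant stops cleanly before the AND. That is the toy-model counterpart of the warning above about the parser getting lost.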