I have a grammar with an LR(1) conflict which I cannot resolve; yet, the grammar should be unambiguous. I'll first demonstrate the problem on a simplified grammar with five tokens: `(`, `)`, `{}`, `,` and `id`.
The EBNF would look like this:
    args       = ( id ',' )*
    expression = id
               | '(' expression ')'
               | '(' args ')' '{}'
The grammar is unambiguous and requires at most two tokens of lookahead. When `(` is shifted, there are only five possibilities:
- `(` → Recur.
- `)` → Reduce as `'(' args ')'`.
- `id` `)` not `{}` → Reduce as `'(' expression ')'`.
- `id` `)` `{}` → Reduce as `'(' args ')' '{}'`.
- `id` `,` → Reduce as `'(' args ')' '{}'` (eventually).
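To make the case analysis above concrete, here is a minimal hand-written sketch in Python. It is not an LR parser and not the generated parser's behaviour; it only shows that peeking at the tokens following a shifted `(` is enough to pick a branch. The function names (`classify_after_lparen`, `parse_args`, `parse_expression`, `accepts`) and the representation of tokens as plain strings are assumptions made for illustration. The argument list is assumed to be a comma-separated list of `id` with an optional trailing comma (as in the `formal_args` rule of the translation), and the third alternative is assumed to be `'(' args ')' '{}'` as in the EBNF.

```python
from typing import List

END = "$"  # assumed end-of-input marker, not one of the five tokens


def peek(tokens: List[str], i: int) -> str:
    """Return tokens[i], or END when i is past the end of the input."""
    return tokens[i] if i < len(tokens) else END


def classify_after_lparen(tokens: List[str], i: int) -> str:
    """tokens[i] is the first token after a shifted '('.

    Looks at up to three tokens and returns 'recur', 'args',
    'expression' or 'error', following the five cases in the text.
    """
    t0, t1, t2 = peek(tokens, i), peek(tokens, i + 1), peek(tokens, i + 2)
    if t0 == "(":
        return "recur"                      # '(' can only start an expression
    if t0 == ")":
        return "args"                       # empty argument list
    if t0 == "id" and t1 == ",":
        return "args"                       # a comma can only occur in args
    if t0 == "id" and t1 == ")":
        return "args" if t2 == "{}" else "expression"
    return "error"


def parse_args(tokens: List[str], i: int) -> int:
    """Consume a possibly empty, comma-separated list of ids."""
    while peek(tokens, i) == "id":
        i += 1
        if peek(tokens, i) != ",":
            break
        i += 1
    return i


def parse_expression(tokens: List[str], i: int = 0) -> int:
    """Return the index just past one expression, or raise SyntaxError."""
    if peek(tokens, i) == "id":
        return i + 1
    if peek(tokens, i) != "(":
        raise SyntaxError(f"unexpected {peek(tokens, i)!r} at {i}")
    case = classify_after_lparen(tokens, i + 1)
    if case in ("recur", "expression"):
        j = parse_expression(tokens, i + 1)
        if peek(tokens, j) != ")":
            raise SyntaxError(f"expected ')' at {j}")
        return j + 1
    if case == "args":
        j = parse_args(tokens, i + 1)
        if peek(tokens, j) != ")" or peek(tokens, j + 1) != "{}":
            raise SyntaxError(f"expected ')' '{{}}' at {j}")
        return j + 2
    raise SyntaxError(f"unexpected {peek(tokens, i + 1)!r} at {i + 1}")


def accepts(tokens: List[str]) -> bool:
    """True if the whole token list is exactly one expression."""
    try:
        return parse_expression(tokens, 0) == len(tokens)
    except SyntaxError:
        return False
```

For example, `accepts(["(", "id", ")"])` takes the `'(' expression ')'` branch, while `accepts(["(", "id", ")", "{}"])` takes the `'(' args ')' '{}'` branch; the two inputs only diverge at the third token after the opening parenthesis.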
A naive translation yields the following result (and conflicts):
    formal_arg: Ident
        {}

    formal_args: formal_arg Comma formal_args
        | formal_arg
        | /* nothing */
        {}

    primary: Ident
        | LParen formal_args Curly
        | LParen primary RParen
        {}
So, the grammar requires at most three tokens of lookahead to decide. I know that an LR(3) grammar can be transformed into an LR(1) grammar.
However, I don't quite understand how to do the transformation in this particular case. Note that the simplified grammar above is an extraction from a larger body of code; in particular, is it possible to transform `primary` without touching `expr` and everything above?