I am trying to write a top-down recursive-descent parser for a small language, and I am facing some issues with the assignment statements. Here is the grammar from the language specifications:
<assign_stmt> ::= <lvalue> <l_tail> ":=" <expr> ";"
<l_tail> ::= ":=" <lvalue> <l_tail>
| ""
<expr> ::= ....
#multiple layers between <expr> and <lvalue>, like <term>, <factor>, etc.
#in the end, <expr> can be a <lvalue>
| <lvalue>
so that the assignments can look like
a := b := 3;
c := d := e := f;
The grammar does not seem to be ambiguous, but it is causing me issues because <expr> can itself be a <lvalue>. When parsing <l_tail>, both production rules look equally valid with ":=" as the next token, and I don't know which one to pick. I tried various left-factorizations (see below), but so far, I have not been able to find an LL(1) grammar that works for me. Is it even possible here?
<assign_stmt> ::= <lvalue> ":=" <rest>
<rest> ::= <expr> ";"
| <lvalue> ":=" <l_tail>
Note that I could work around this issue by parsing <l_tail> and then looking for the ";" token. Depending on the result, I would know whether the last <lvalue> was actually an <expr> or not (without having to backtrack). However, I am learning here, and I would like to know the "right" way to overcome this problem.
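For concreteness, here is a minimal Python sketch of one variant of that workaround. It is not the grammar from the specification: it assumes the statement is parsed as <expr> { ":=" <expr> } ";" and checks afterwards that every operand except the last is a bare <lvalue>. Since <expr> is elided above, the expression grammar here (identifiers, integers, and "+") is a stand-in, and the names tokenize, Parser, and parse are hypothetical.

```python
import re

# Hypothetical token set: the real <expr> is elided in the spec, so this
# sketch assumes only identifiers, integers, "+", ":=" and ";".
TOKEN_RE = re.compile(r"\s*(:=|[A-Za-z_]\w*|\d+|[+;])")


def tokenize(src):
    """Split the source text into a flat list of token strings."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # One token of lookahead; None marks end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def parse_operand(self):
        tok = self.peek()
        if tok is None or tok in (":=", ";", "+"):
            raise SyntaxError(f"expected operand, got {tok!r}")
        self.pos += 1
        return ("num", int(tok)) if tok.isdigit() else ("lvalue", tok)

    def parse_expr(self):
        # Stand-in for the elided <expr>: operand { "+" operand }
        node = self.parse_operand()
        while self.peek() == "+":
            self.pos += 1
            node = ("add", node, self.parse_operand())
        return node

    def parse_assign_stmt(self):
        # Parse every operand of the ":=" chain as a full <expr> ...
        items = [self.parse_expr()]
        while self.peek() == ":=":
            self.pos += 1
            items.append(self.parse_expr())
        self.expect(";")
        # ... then check that all operands but the last are bare lvalues.
        *targets, value = items
        if not targets:
            raise SyntaxError("assignment needs at least one ':='")
        for target in targets:
            if target[0] != "lvalue":
                raise SyntaxError(f"cannot assign to {target!r}")
        return ("assign", [target[1] for target in targets], value)


def parse(src):
    return Parser(tokenize(src)).parse_assign_stmt()
```

With this sketch, parse("a := b := 3;") yields one assignment node with targets a and b, while parse("a := 1 := b;") is rejected by the post-check rather than by the grammar. Every decision uses a single token of lookahead, but the "is this an lvalue?" question has moved from the grammar into a check after parsing, which is exactly the part I am unsure counts as the "right" way.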