
I want to remove the left recursion from this grammar:

E->E+T|E-T|T
T->T*F|T/F|F

For + and *, I am sure it should be:

E->TE'
E'->+TE'|(e)    ((e) denotes the empty string)
T->FT'
T'->*FT'|(e)

But for - and /, I am not sure how to remove the left recursion. I came up with the grammar below; is it right for - and /? For example, addition is commutative (a+b = b+a), but subtraction is not (a-b != b-a). So if we use the following right-recursive grammar, do we get a problem with expressions like a-b?

E->TE'
E'->+TE'|-TE'|(e) 
T->FT'
T'->*FT'|/FT'|(e)

Can anyone who knows compilers explain this to me? Thanks in advance.

David
  • What do you mean by `4-3`? Do you have an interpreter which interprets this grammar? And it returns `-1` instead of `1`? The problem is probably not part of the grammar but of the interpreter. If so, then you should provide more information on the parser generator and the interpreter code. – CoronA Oct 25 '15 at 18:36
  • I think you meant `T->T*F|T/F|F` (rather than `E/T`) – rici Oct 25 '15 at 18:36
  • @rici yes, that's what I mean – David Oct 25 '15 at 18:38
  • @CoronA it's not about interpreter. I edited my question. – David Oct 25 '15 at 19:02
  • Not clear what your question is -- yes, the final grammar you show is the correct result of eliminating left recursion from your initial grammar. It can parse `+`, `-`, `*`, or `/`, just as the original grammar can, and recognizes the exact same language. Obviously the parse tree for any given input will be different. – Chris Dodd Oct 25 '15 at 19:14
  • @David: The syntax tree may have the wrong associativity. But it is common for semantic analysis to transform the syntax tree so that the correct associativity is achieved. And this is not a bug of the grammar, but of the semantic analysis/interpreter. Besides, correct rule 2 as rici recommended. – CoronA Oct 25 '15 at 19:15

1 Answer

Left-recursion elimination allows an LL parser to correctly recognize the language, but the resulting parse tree no longer reflects the intended associativity. In particular, it replaces the left-associative parses for operators such as - and / with right-associative ones, so a-b-c is grouped as a-(b-c) instead of (a-b)-c.

To use the parse to actually evaluate or interpret the input, you need to recover the correct parse tree, effectively by reversing the associativity for left-associative operators.
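
One way to recover the left-associative tree while keeping the right-recursive productions is to pass the operand seen so far down into the E' and T' functions, so each step combines it with the next operand before recursing. The following is only a minimal sketch of that idea: the function names, the tuple-based tree, and the restriction of F to plain integers are assumptions made for illustration and are not part of the original grammar.

```python
import re


def tokenize(text):
    # Integers and single-character operators; whitespace is skipped.
    return re.findall(r"\d+|[-+*/]", text)


def parse_factor(tokens, pos):
    # F -> number (assumption: parentheses are left out to keep the sketch short)
    return int(tokens[pos]), pos + 1


def parse_term_tail(left, tokens, pos):
    # T' -> * F T' | / F T' | (e)
    if pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        # Combine with the operand seen so far BEFORE recursing,
        # which yields a left-leaning tree.
        return parse_term_tail((op, left, right), tokens, pos)
    return left, pos


def parse_term(tokens, pos):
    # T -> F T'
    left, pos = parse_factor(tokens, pos)
    return parse_term_tail(left, tokens, pos)


def parse_expr_tail(left, tokens, pos):
    # E' -> + T E' | - T E' | (e)
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        return parse_expr_tail((op, left, right), tokens, pos)
    return left, pos


def parse_expr(tokens, pos=0):
    # E -> T E'
    left, pos = parse_term(tokens, pos)
    return parse_expr_tail(left, tokens, pos)


def parse(text):
    tree, _ = parse_expr(tokenize(text))
    return tree
```

With this sketch, `parse("4-3-2")` builds the tree for (4-3)-2 rather than 4-(3-2), even though the functions follow the right-recursive productions one to one.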

Alternatively, you could just use a bottom-up parser such as an LALR(1) parser generated by yacc/bison. Or you could write or adapt the operator-precedence algorithm (see "Shunting Yard").
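
As a rough illustration of the operator-precedence approach, here is a small shunting-yard sketch that converts an infix expression to postfix (RPN). The `PRECEDENCE` table and the `to_rpn` name are assumptions for this example; it handles only integers, the four left-associative operators, and parentheses.

```python
import re

# Assumed precedence levels; all four operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_rpn(text):
    output, stack = [], []
    for tok in re.findall(r"\d+|[-+*/()]", text):
        if tok.isdigit():
            output.append(tok)
        elif tok in PRECEDENCE:
            # Left associativity: pop operators of equal or higher precedence
            # before pushing the new one.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        else:  # tok == ")"
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("unbalanced ')'")
            stack.pop()  # discard the matching "("
    while stack:
        op = stack.pop()
        if op == "(":
            raise SyntaxError("unbalanced '('")
        output.append(op)
    return output
```

The `>=` comparison is what makes `-` and `/` left-associative here: for `4-3-2` the first `-` is emitted before the second one is pushed, giving `4 3 - 2 -`.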

If you're going to use the LL grammar in a recursive descent parser, the problem can be avoided since the recursive descent parser typically has an explicit loop instead of a recursion on the right-recursive production (in pseudo-code):

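parse_term():
  # T -> F T', with the right-recursive T' replaced by a loop
  f = parse_factor()
  while peek() in ('*', '/'):
    op = token()
    f2 = parse_factor()
    # fold into the left operand, so a/b/c is built as (a/b)/c
    f = apply_operator(op, f, f2)
  return f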
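
The pseudo-code above can be fleshed out into a small runnable evaluator. This is a sketch under assumptions, not a definitive implementation: the `Parser` class, the regex tokenizer, the `OPS` table standing in for `apply_operator`, and the `evaluate` helper are all invented for illustration, while `peek`, `token`, `parse_factor` and `parse_term` follow the names used in the pseudo-code.

```python
import operator
import re

# Stand-in for apply_operator: maps each operator token to a function.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


class Parser:
    def __init__(self, text):
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0

    def peek(self):
        # Next token without consuming it, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def token(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def parse_factor(self):
        tok = self.token()
        if tok == "(":
            value = self.parse_expr()
            if self.token() != ")":
                raise SyntaxError("expected ')'")
            return value
        if not tok.isdigit():
            raise SyntaxError("unexpected token " + tok)
        return int(tok)

    def parse_term(self):
        value = self.parse_factor()
        # The loop replaces T' and folds to the left: a/b/c is (a/b)/c.
        while self.peek() in ("*", "/"):
            op = self.token()
            value = OPS[op](value, self.parse_factor())
        return value

    def parse_expr(self):
        value = self.parse_term()
        # Same pattern for E': a-b-c is (a-b)-c.
        while self.peek() in ("+", "-"):
            op = self.token()
            value = OPS[op](value, self.parse_term())
        return value


def evaluate(text):
    parser = Parser(text)
    value = parser.parse_expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token " + parser.peek())
    return value
```

For example, `evaluate("4-3-2")` gives -1, which is (4-3)-2; a right-associative reading would give 4-(3-2) = 3.
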
rici
  • Thanks for your reply, I edited my question. It was confusing originally. – David Oct 25 '15 at 19:01
  • @David: I think my answer is still correct. Basically: Yes, it is a problem that the parse tree is not correct, but it can be dealt with; the answer attempts some suggestions as to how. – rici Oct 25 '15 at 21:07