3

I have the following grammar:

S -> S+S|SS|S*|(S)|a

How do I convert it into a LL(1) grammar?

I tried to eliminate left recursion so I got

S->(S)S'|aS'
S'->+SS'|SS'|*S'|epsilon

I also tried to do left factoring first and then eliminate left recursion and I got something like :

S->(S)S"|aS"
       S"->S'S"|epsilon
       S'->+S|*|S

But I still do not get a perfect answer. I feel grammar is still not LL(1). Please help.

Alex K.
  • 171,639
  • 30
  • 264
  • 288
arpat
  • 41
  • 3
  • This question appears to be theoretical rather than practical. (The grammar does not appear to be realistic.) Try http://cs.stackoverflow.com – Raymond Chen Oct 02 '14 at 13:13
  • @RaymondChen: It's a (simplified) grammar for regular expressions using `+` for the alternation operator usually written as `|`. Why is that not realistic? – rici Oct 03 '14 at 02:54
  • @rici Most languages have more terminals than `a`. If you want a regular expression parser, then just use an existing one - no need to reinvent the wheel. – Raymond Chen Oct 03 '14 at 06:59
  • I might be missing something, but how is the second grammar not already LL(1)? –  Nov 10 '15 at 16:14

1 Answers1

1

It might help to try writing the grammar so that you read some complete term, then optionally try extending it in some way. For example, you might try something like this:

S → Term

Term → CoreTerm OptMore

CoreTerm → a | (Term)

OptMore → ε | Term | + Term | * OptMore

For example, you'd derive a(a+a)*a as

S

⇒ Term

⇒ CoreTerm OptMore

⇒ a OptMore

⇒ a Term

⇒ a CoreTerm OptMore

⇒ a(CoreTerm OptMore) OptMore

⇒ a(a OptMore) OptMore

⇒ a(a + Term) OptMore

⇒ a(a + CoreTerm OptMore) OptMore

⇒ a(a + a OptMore) OptMore

⇒ a(a + a) OptMore

⇒ a(a + a)* OptMore

⇒ a(a + a)* Term

⇒ a(a + a)* CoreTerm OptMore

⇒ a(a + a)* a OptMore

⇒ a(a + a)* a

To see that this is an LL(1) grammar, here's the FIRST sets:

  • FIRST(S) = {
  • FIRST(Term) = { a, ( }
  • FIRST(CoreTerm) = { a, ( }
  • FIRST(OptMore) = { ε, a, (, +, * }

Here's the FOLLOW sets:

  • FOLLOW(S) = {$}
  • FOLLOW(Term) = {$, )}
  • FOLLOW(CoreTerm) = {a, (, +, *, $}
  • FOLLOW(OptMore) = {$, )}

So now we can fill in the parse table:

         | a                | (                | +      | *         | )   | $
---------+------------------+------------------+--------+-----------+-----+------
S        | Term             | Term             |        |           |     |
Term     | CoreTerm OptMore | CoreTerm OptMore |        |           |     |
CoreTerm | a                | (Term)           |        |           |     |
OptMore  | Term             | Term             | + Term | * OptMore | eps | eps

So this grammar is indeed LL(1).

Community
  • 1
  • 1
templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065