Converting a context free grammar into a LL(1)

Question

I have the following grammar:

S -> S+S|SS|S*|(S)|a

How do I convert it into a LL(1) grammar?

I tried to eliminate left recursion so I got

S->(S)S'|aS'
S'->+SS'|SS'|*S'|epsilon

I also tried to do left factoring first and then eliminate left recursion and I got something like :

S->(S)S"|aS"
       S"->S'S"|epsilon
       S'->+S|*|S

But I still do not get a perfect answer. I feel grammar is still not LL(1). Please help.

This question appears to be theoretical rather than practical. (The grammar does not appear to be realistic.) Try http://cs.stackoverflow.com — Raymond Chen, Oct 02 '14 at 13:13
@RaymondChen: It's a (simplified) grammar for regular expressions using `+` for the alternation operator usually written as `|`. Why is that not realistic? — rici, Oct 03 '14 at 02:54
@rici Most languages have more terminals than `a`. If you want a regular expression parser, then just use an existing one - no need to reinvent the wheel. — Raymond Chen, Oct 03 '14 at 06:59
I might be missing something, but how is the second grammar not already LL(1)? — , Nov 10 '15 at 16:14

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

It might help to try writing the grammar so that you read some complete term, then optionally try extending it in some way. For example, you might try something like this:

S → Term

Term → CoreTerm OptMore

CoreTerm → a | (Term)

OptMore → ε | Term | + Term | * OptMore

For example, you'd derive a(a+a)*a as

S

⇒ Term

⇒ CoreTerm OptMore

⇒ a OptMore

⇒ a Term

⇒ a CoreTerm OptMore

⇒ a(CoreTerm OptMore) OptMore

⇒ a(a OptMore) OptMore

⇒ a(a + Term) OptMore

⇒ a(a + CoreTerm OptMore) OptMore

⇒ a(a + a OptMore) OptMore

⇒ a(a + a) OptMore

⇒ a(a + a)* OptMore

⇒ a(a + a)* Term

⇒ a(a + a)* CoreTerm OptMore

⇒ a(a + a)* a OptMore

⇒ a(a + a)* a

To see that this is an LL(1) grammar, here's the FIRST sets:

FIRST(S) = {
FIRST(Term) = { a, ( }
FIRST(CoreTerm) = { a, ( }
FIRST(OptMore) = { ε, a, (, +, * }

Here's the FOLLOW sets:

FOLLOW(S) = {$}
FOLLOW(Term) = {$, )}
FOLLOW(CoreTerm) = {a, (, +, *, $}
FOLLOW(OptMore) = {$, )}

So now we can fill in the parse table:

         | a                | (                | +      | *         | )   | $
---------+------------------+------------------+--------+-----------+-----+------
S        | Term             | Term             |        |           |     |
Term     | CoreTerm OptMore | CoreTerm OptMore |        |           |     |
CoreTerm | a                | (Term)           |        |           |     |
OptMore  | Term             | Term             | + Term | * OptMore | eps | eps

So this grammar is indeed LL(1).

Converting a context free grammar into a LL(1)

1 Answers1