0

How would I apply the FIRST() rule to a production such as:

A -> AAb | Ab | s

where A is a non-terminal, and b and s are terminals.

For alternatives 1 and 2, FIRST(A) would depend on FIRST(A) again. Wouldn't that lead to infinitely many applications of FIRST, since I need a terminal to get the FIRST set?

templatetypedef
FinalFortune

3 Answers

1

To compute FIRST sets, you typically perform a fixed-point iteration. That is, you start off with a small set of values, then iteratively recompute FIRST sets until the sets converge.

In this case, you would start off by noting that the production A → s means that FIRST(A) must contain {s}. So initially you set FIRST(A) = {s}.

Now, you iterate across each production of A and update FIRST based on the knowledge of the FIRST sets you've computed so far. For example, the rule

A → AAb

means that you should update FIRST(A) to include all elements of FIRST(AAb). Since A cannot derive ε, FIRST(AAb) is just FIRST(A), so this causes no change to FIRST(A). You then visit

A → Ab

You again update FIRST(A) to include FIRST(Ab), which is again a no-op. Finally, you visit

A → s

And since FIRST(A) already contains s, this causes no change.

Since nothing changed on this iteration, you would end up with FIRST(A) = {s}, which is indeed correct because any derivation starting at A ultimately will produce an s as its first character.
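
The iteration described above can be sketched in Python. This is only a minimal sketch for FIRST with one symbol of lookahead, not a definitive implementation: the function name `first_sets`, the dictionary encoding of the grammar, and the `EPSILON` marker are all assumptions made for illustration. It looks past a nonterminal only when that nonterminal's FIRST set is already known to contain ε.

```python
EPSILON = "ε"  # marker for the empty string (an assumption of this sketch)


def first_sets(grammar, terminals):
    """Compute FIRST sets (one symbol of lookahead) by fixed-point iteration.

    grammar maps each nonterminal to a list of right-hand sides; each
    right-hand side is a tuple of symbols, and () stands for ε.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                for symbol in rhs:
                    if symbol in terminals:
                        first[nt].add(symbol)
                        break
                    # Nonterminal: copy what is known so far, minus ε.
                    first[nt] |= first[symbol] - {EPSILON}
                    if EPSILON not in first[symbol]:
                        break  # do not look past a non-nullable symbol
                else:
                    # Every symbol was nullable, or the right-hand side is empty.
                    first[nt].add(EPSILON)
                if len(first[nt]) != before:
                    changed = True
    return first


# The grammar from the question: A -> AAb | Ab | s
grammar = {"A": [("A", "A", "b"), ("A", "b"), ("s",)]}
result = first_sets(grammar, terminals={"b", "s"})
print(result)  # {'A': {'s'}}
```

If the last alternative were ε instead of s, the same function would report FIRST(A) = {ε, b}, because the leading A symbols would then be nullable and the scan would reach b.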

For more information, you might find these lecture slides useful (here's part two). They describe in detail how top-down parsing works and how to iteratively compute FIRST sets.

Hope this helps!

Apalala
templatetypedef
  • @Apalala- Are you sure about this? The symbol `b` is definitely not in FIRST(A). The algorithm I know of for computing FIRST sets works by seeding the FIRST set with all productions starting with a terminal, then proceeding from there. – templatetypedef Mar 18 '13 at 15:50
  • @Apalala- Alternatively, if you don't seed everything that way, you don't compute FIRST sets by looking past the FIRST set of a nonterminal unless that nonterminal can produce epsilon. This would mean that the update rule would add FIRST(A) to FIRST(A), which would add the empty set to itself. – templatetypedef Mar 18 '13 at 15:51
  • You could seed FIRST with terminals for rules of the form `A -> b`, but that's not general enough. But the reason for my downvote is that you suggest that when calculating FIRST from `A -> AAb` one needs to look only at the first `A`, and that's wrong, because, though not in this case, it could be that `A =*=> ε`, which would make `FIRST(A)` contain `b`. Please see my answer for the general solution to *FIRST*. – Apalala Mar 18 '13 at 16:15
  • @Apalala- Having taught a compilers course twice, I'm fairly confident that you do **not** look past a nonterminal when doing FIRST set computation unless you have explicitly found that the FIRST set for that nonterminal contains epsilon. You don't optimistically look past nonterminals. Does that make sense? – templatetypedef Mar 18 '13 at 16:19
  • You are right, for FIRST[1], but not for FIRST[k]. I edited your answer to make it general, and correct. I hope you don't mind. – Apalala Mar 18 '13 at 16:25
-1

My teaching notes are in Spanish, but the algorithms are in English. This is one way to calculate FIRST:

for each a ∈ Σ do
     F(a) := {a}
for each A ∈ N do
     if A→ε ∈ P then
          F(A) := {ε}
     else
          F(A) := ∅
repeat
     for each A ∈ N do
          F'(A) := F(A)
     for each A → X1X2...Xn ∈ P do
          if n > 0 then
               F(A) := F(A) ∪ F'(X1) ⋅k F'(X2) ⋅k ... ⋅k F'(Xn)
until F(A) = F'(A) for all A ∈ N
FIRSTk(X) := F(X) for all X ∈ (Σ ∪ N)

Σ is the alphabet (terminals), N is the set of non-terminals, P is the set of productions (rules), ε is the null string, and ⋅k is concatenation trimmed to k places. Note that ∅ ⋅k x = ∅, and that concatenating two sets produces the concatenation of the elements in the Cartesian product.

The easiest way to calculate FIRST sets by hand is by using one table per algorithm iteration.

F(A) = ∅

F'(A) = F(A) ⋅1 F(A) ⋅1 F(b) ∪ F(A) ⋅1 F(b) ∪ F(s)
F'(A) = ∅ ⋅1 ∅ ⋅1 {b} ∪ ∅ ⋅1 {b} ∪ {s}
F'(A) = ∅ ∪ ∅ ∪ {s}
F'(A) = {s}

F''(A) = F'(A) ⋅1 F'(A) ⋅1 F'(b) ∪ F'(A) ⋅1 F'(b) ∪ F'(s)
F''(A) = {s} ⋅1 {s} ⋅1 {b} ∪ {s} ⋅1 {b} ∪ {s}
F''(A) = {s} ∪ {s} ∪ {s}
F''(A) = {s}

And we're done, because F' = F'', so FIRST = F'', and FIRST(A) = {s}.
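
For readers who prefer running code, the pseudocode above might be sketched in Python roughly as follows. The names `concat_k` and `first_k`, and the choice to represent strings as tuples of terminals with `()` standing for ε, are assumptions of this sketch and not part of the original notes.

```python
def concat_k(left, right, k):
    """Concatenate every pair of strings from two sets, trimmed to k symbols.

    Strings are tuples of terminals and () plays the role of ε.
    An empty set on either side yields an empty set.
    """
    return {(x + y)[:k] for x in left for y in right}


def first_k(productions, terminals, k):
    """Iteratively compute F(X) for every grammar symbol X.

    productions is a list of (nonterminal, right-hand-side tuple) pairs.
    """
    nonterminals = {lhs for lhs, _ in productions}
    f = {a: {(a,)} for a in terminals}
    for nt in nonterminals:
        f[nt] = {()} if (nt, ()) in productions else set()
    while True:
        # Snapshot of the previous iteration (F' in the pseudocode).
        previous = {symbol: set(strings) for symbol, strings in f.items()}
        for lhs, rhs in productions:
            if rhs:  # n > 0; ε-productions were handled at initialization
                acc = {()}
                for symbol in rhs:
                    acc = concat_k(acc, previous[symbol], k)
                f[lhs] |= acc
        if all(f[nt] == previous[nt] for nt in nonterminals):
            return f


# The grammar from the question: A -> AAb | Ab | s
productions = [("A", ("A", "A", "b")), ("A", ("A", "b")), ("A", ("s",))]
print(first_k(productions, {"b", "s"}, k=1)["A"])  # {('s',)}
```

With `k=2` the same call yields three strings for A, namely s, ss and sb, which shows how the trimmed concatenation starts to look past the first symbol.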

Apalala
  • 9,017
  • 3
  • 30
  • 48
-2

Your grammar rule is left-recursive, as you already realized, and LL parsers cannot parse left-recursive grammars.

So you need to eliminate the left recursion first; then you should be able to compute the FIRST set for the rule.
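
As a rough illustration of that step, here is a minimal Python sketch of the textbook transformation for immediate left recursion, which rewrites A → Aα | β as A → βA' and A' → αA' | ε. The function name, the tuple encoding of right-hand sides, and the fresh nonterminal name `A'` are assumptions of this sketch. It does not handle indirect left recursion or a production of the form A → A.

```python
def remove_immediate_left_recursion(nt, alternatives):
    """Rewrite the productions of one nonterminal without immediate left recursion.

    alternatives is a list of right-hand sides, each a tuple of symbols;
    () stands for ε. The new nonterminal name is assumed to be unused.
    """
    # Split into A -> A alpha (keeping only alpha) and A -> beta.
    recursive = [rhs[1:] for rhs in alternatives if rhs[:1] == (nt,)]
    others = [rhs for rhs in alternatives if rhs[:1] != (nt,)]
    if not recursive:
        return {nt: list(alternatives)}
    tail = nt + "'"
    return {
        nt: [beta + (tail,) for beta in others],
        tail: [alpha + (tail,) for alpha in recursive] + [()],
    }


# The grammar from the question: A -> AAb | Ab | s
rewritten = remove_immediate_left_recursion(
    "A", [("A", "A", "b"), ("A", "b"), ("s",)]
)
print(rewritten)
```

For the grammar in the question this gives A → s A' and A' → A b A' | b A' | ε.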

none
  • While this is true, I think the OP's question is about computing FIRST sets, which is possible even if the grammar is left-recursive. – templatetypedef Mar 18 '13 at 15:43
  • The question is about calculating the *FIRST* set, and that can be done for left-recursive grammars. – Apalala Mar 18 '13 at 15:51