All it says is which terminals you can expect at the start of a sentential form, so that you know you can replace S by AB in the leftmost derivation. So, if A derives ε, then in the leftmost derivation you can replace A by ε. Now you depend on B, and so on. Consider this sample grammar:
S -> AB
A -> ε
B -> h
So, if there is a string with just one character/terminal, "h", and you start verifying whether this string is valid by checking whether some leftmost derivation derives it using the above grammar, then you can safely replace S by AB, because A will derive ε and B will derive h.
Therefore, the language recognized by the above grammar cannot contain the empty string ε, since B derives only h. For ε to be in the language, B should also derive ε. Then both the non-terminals A and B derive ε, and therefore S derives ε.
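To make the "does it derive ε" check concrete, here is a minimal sketch that finds which non-terminals can derive the empty string. The function name nullable_nonterminals and the dictionary encoding of the grammar (an empty list standing for ε) are my own assumptions for illustration, not part of any standard tool.

```python
def nullable_nonterminals(grammar):
    """Return the set of non-terminals that can derive the empty string.

    `grammar` maps each non-terminal to a list of right-hand sides; each
    right-hand side is a list of symbols, and [] stands for ε.
    (This encoding is an assumption made for this sketch.)
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # head is nullable if some body consists only of nullable
            # symbols; an empty body satisfies this trivially.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# The sample grammar: S -> AB, A -> ε, B -> h
sample = {"S": [["A", "B"]], "A": [[]], "B": [["h"]]}
# The same grammar with the extra production B -> ε
with_eps = {"S": [["A", "B"]], "A": [[]], "B": [["h"], []]}

print(sorted(nullable_nonterminals(sample)))    # ['A']
print(sorted(nullable_nonterminals(with_eps)))  # ['A', 'B', 'S']
```

With the original grammar only A is nullable, so S cannot derive ε; once B -> ε is added, S becomes nullable too.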
That is, if there is some production S->ABCD, then S can derive ε through it only if all the non-terminals A, B, C and D derive ε, and in that case ε will be in FIRST(S).
The FIRST sets given by you are correct. I think you are confused because the production S->A has only one non-terminal, A, on the rhs, and this A derives ε. Now as per b), FIRST(S) = {FIRST(A) - ε, a} = {b, a}, which is incorrect. Since the rhs has only one non-terminal, there is the following possibility: S -> A -> ε, which means that FIRST(S) contains ε, i.e. S can derive the empty string ε.
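The rule can be sketched as a small fixed-point computation. This is only an illustration under assumptions: the names first_of_sequence and first_sets and the dictionary encoding are mine, and since the grammar from the question is not reproduced here, the grammar called unit below is a hypothetical one chosen to be consistent with the FIRST(S) = {b, a} calculation quoted above.

```python
EPSILON = "ε"


def first_of_sequence(symbols, first, nonterminals):
    """FIRST of a sequence of symbols, given the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        if sym not in nonterminals:
            result.add(sym)  # a terminal starts the string; stop here
            return result
        result |= first[sym] - {EPSILON}
        if EPSILON not in first[sym]:
            return result  # sym cannot vanish, so later symbols never lead
    # Every symbol can derive ε (or the sequence is empty), so ε is included.
    result.add(EPSILON)
    return result


def first_sets(grammar):
    """Iterate until no FIRST set grows any further."""
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_sequence(body, first, grammar.keys())
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


# S -> AB, A -> ε, B -> h  (the sample grammar; [] stands for ε)
sample = {"S": [["A", "B"]], "A": [[]], "B": [["h"]]}
# S -> ABCD where A, B, C and D all derive ε
all_nullable = {"S": [["A", "B", "C", "D"]],
                "A": [[]], "B": [[]], "C": [[]], "D": [[]]}
# Hypothetical stand-in for the question's grammar: S -> A | a, A -> b | ε
unit = {"S": [["A"], ["a"]], "A": [["b"], []]}

print(sorted(first_sets(sample)["S"]))        # ['h']
print(sorted(first_sets(all_nullable)["S"]))  # ['ε']
print(sorted(first_sets(unit)["S"]))          # ['a', 'b', 'ε']
```

The last line is the S -> A case: because the whole rhs can vanish, ε ends up in FIRST(S) alongside b and a.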