As pointed out in the comments, these grammars are wrong, since they generate strings not in the language. Here's a derivation of abcc in both grammars:
S -> aS -> abA -> abcA -> abccA -> abcc
S -> Sc -> Scc -> Abcc -> Aabcc -> abcc
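For anyone who wants to check this mechanically, here is a minimal Python sketch that searches for a derivation breadth-first over sentential forms. The production sets g1 and g2 below are reconstructed from the two derivations above and are an assumption; the grammars in the question may differ in detail:

from collections import deque

def derivable(grammar, start, target, max_len=8):
    # Breadth-first search over sentential forms. Uppercase letters are
    # nonterminals; '' encodes the empty string e. Returns True if
    # target can be derived from start.
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for i, sym in enumerate(form):
            if sym.isupper():
                # Rewriting only the leftmost nonterminal suffices for CFGs.
                for rhs in grammar[sym]:
                    new = form[:i] + rhs + form[i + 1:]
                    if len(new) <= max_len and new not in seen:
                        seen.add(new)
                        queue.append(new)
                break
    return False

# Productions reconstructed from the derivations above (an assumption):
g1 = {'S': ['aS', 'bA'], 'A': ['cA', '']}
g2 = {'S': ['Sc', 'Ab'], 'A': ['Aa', '']}
print(derivable(g1, 'S', 'abcc'), derivable(g2, 'S', 'abcc'))  # True True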
Also as pointed out in the comments, there is a simple linear grammar for this language, where a linear grammar is defined as having at most one nonterminal symbol in the RHS of any production:
S -> aSc | b
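As a sanity check, here is a small Python sketch (the function name expand is mine) that expands this grammar exhaustively up to a bounded number of rule applications; the strings produced are exactly those of the form a^n b c^n:

def expand(form, depth):
    # All terminal strings derivable from `form` in at most `depth`
    # applications of S -> aSc | b.
    if 'S' not in form:
        return {form}
    if depth == 0:
        return set()
    out = set()
    for rhs in ('aSc', 'b'):
        out |= expand(form.replace('S', rhs, 1), depth - 1)
    return out

print(sorted(expand('S', 5), key=len))
# ['b', 'abc', 'aabcc', 'aaabccc', 'aaaabcccc']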
There are some general rules for constructing grammars for languages. These are either obvious simple rules or rules derived from closure properties and the way grammars work (rules 3-5 are sketched in code after the list). For instance:
1. if L = {a} for an alphabet symbol a, then S -> a is a grammar for L.
2. if L = {e} for the empty string e, then S -> e is a grammar for L.
3. if L = R U T for languages R and T, then S -> S' | S'', along with the grammars for R and T, is a grammar for L, if S' is the start symbol of the grammar for R and S'' is the start symbol of the grammar for T.
4. if L = RT for languages R and T, then S -> S'S'', along with the grammars for R and T, is a grammar for L, if S' is the start symbol of the grammar for R and S'' is the start symbol of the grammar for T.
5. if L = R* for language R, then S -> S'S | e, along with the grammar for R, is a grammar for L, if S' is the start symbol of the grammar for R.
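Here is a minimal sketch of rules 3-5 in Python, with a grammar represented as a dict from single-uppercase-letter nonterminals to lists of right-hand sides (the same encoding as in the earlier sketch). It assumes the two input grammars use disjoint nonterminals and that neither already uses S; in general you would rename nonterminals first:

def union(gR, sR, gT, sT):
    # Rule 3: S -> S' | S''
    return {**gR, **gT, 'S': [sR, sT]}, 'S'

def concat(gR, sR, gT, sT):
    # Rule 4: S -> S'S''
    return {**gR, **gT, 'S': [sR + sT]}, 'S'

def star(gR, sR):
    # Rule 5: S -> S'S | e  ('' encodes e)
    return {**gR, 'S': [sR + 'S', '']}, 'S'

# Example: a grammar for RT with R -> aRb | ab and T -> cTd | cd.
gL, start = concat({'R': ['aRb', 'ab']}, 'R', {'T': ['cTd', 'cd']}, 'T')
print(gL)  # {'R': ['aRb', 'ab'], 'T': ['cTd', 'cd'], 'S': ['RT']}

Note that concat already yields the production S -> RT, whose right-hand side contains two nonterminals; that is exactly where linearity fails, as discussed next.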
Rules 4 and 5, as written, do not preserve linearity. Linearity can be preserved for left-linear and right-linear grammars (since those grammars describe regular languages, and regular languages are closed under these kinds of operations); but linearity cannot be preserved in general. To prove this, an example suffices:
R -> aRb | ab
T -> cTd | cd
L = RT = a^n b^n c^m d^m, 0 < n, m
L' = R* = (a^n b^n)*, 0 < n (each repetition may use a different n)
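To make the two languages concrete, here is a hedged Python sketch of membership tests (the helper names in_L and in_Lprime are mine, not from the question):

import re

def in_L(w):
    # a^n b^n c^m d^m with n, m >= 1
    m = re.fullmatch(r'(a+)(b+)(c+)(d+)', w)
    return bool(m) and len(m[1]) == len(m[2]) and len(m[3]) == len(m[4])

def in_Lprime(w):
    # (a^n b^n)*: peel one balanced block and recurse. The leading run
    # of a's must be matched by an equally long run of b's, because the
    # next block (if any) starts with an a.
    if w == '':
        return True
    m = re.fullmatch(r'(a+)(b+)(.*)', w)
    return bool(m) and len(m[1]) == len(m[2]) and in_Lprime(m[3])

print(in_L('aabbcd'), in_L('aabcd'))          # True False
print(in_Lprime('abaabb'), in_Lprime('abb'))  # True False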
Suppose there were a linear grammar for L. We must have a production for the start symbol S that produces something. To produce something, we require a string of terminal and nonterminal symbols. To be linear, we must have at most one nonterminal symbol. That is, our production must be of the form
S -> xYz
where x is a string of terminals, Y is a single nonterminal, and z is a string of terminals. If x is non-empty, reflection shows the only useful choice is a; anything else fails to derive known strings in the language. Similarly, if z is non-empty, the only useful choice is d. This gives four cases:
1. x empty, z empty. This is useless, since we now have the same problem to solve for nonterminal Y as we had for S.
2. x = a, z empty. Y must now generate exactly a^n' b^n' b c^m d^m where n' = n - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
3. x empty, z = d. Y must now generate exactly a^n b^n c c^m' d^m' where m' = m - 1. But then the exact same argument applies to the grammar whose start symbol is Y.
4. x = a, z = d. Y must now generate exactly a^n' b^n' bc c^m' d^m' where n' and m' are as in cases 2 and 3. But then the exact same argument applies to the grammar whose start symbol is Y.
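For intuition on case 2, a small Python sketch (the helper L_strings is mine, and this illustration is not part of the proof) computes the residual language Y would have to generate if the start production were S -> aY:

def L_strings(k):
    # All strings a^n b^n c^m d^m with 1 <= n, m <= k.
    return {'a' * n + 'b' * n + 'c' * m + 'd' * m
            for n in range(1, k + 1) for m in range(1, k + 1)}

# Y would have to derive exactly { w[1:] : w in L }, i.e.
# a^n' b^n' b c^m d^m -- still two independent balanced nestings,
# which is why the argument recurses on Y.
print(sorted({w[1:] for w in L_strings(2)}, key=lambda w: (len(w), w)))
# ['bcd', 'abbcd', 'bccdd', 'abbccdd']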
None of the possible choices for a production for S actually gets us closer to a string in the language. Therefore, no strings are derived, a contradiction, meaning that the grammar for L cannot be linear.
Suppose there were a linear grammar for L'. Then that grammar would have to generate all the strings in (a^n b^n)R(a^m b^m), plus those in e + R. But it can't generate the ones in the former set, by the argument used above: any production useful for that purpose would get us no closer to a string in the language.