Going from a language to a context free grammar

Question

Given the language K = {e^h f^i | 2h > i > h}, I need to generate a context-free grammar.

Some production rules I came up with are: S -> eeTfff and T -> eTff | ϵ

They only work when n = m + 1, but I don't know how to generate any rules for every combination in 2h > i > h.

Your solution seems okay, although it wouldn't accept an empty string. Can you explain the problem more clearly? Because `n` and `m` are nowhere in the language description. — Dekker1, Oct 18 '17 at 02:23
This question is remarkably similar to one asked a few minutes earlier on [math.se]: https://math.stackexchange.com/questions/2477734/context-free-grammar-from-language. Since this question has a lot more to do with formal language theory than *programming*, [math.se] is probably the best place to find an answer. — rici, Oct 18 '17 at 02:28

score 1 · Answer 1 · answered Oct 18 '17 at 17:14

First, identify the shortest string in the language. We need i > h, so we might guess h = 0; however, that leads nowhere since we cannot satisfy 2h > i. We run into the same thing with h = 1, since no integer i satisfies 2 > i > 1. Choosing h = 2, the only choice for i is 3. So the shortest string in the language is eefff. There cannot be any other strings of length 5.
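The search for the shortest string can also be done mechanically. The following is a minimal sketch, not part of the original answer; the helper names `in_language` and `shortest_members` are made up for illustration, and a string e^h f^i is represented simply by the pair (h, i).

```python
def in_language(h, i):
    """Membership test for K = { e^h f^i | 2h > i > h }."""
    return h < i < 2 * h


def shortest_members(max_len=10):
    """Return every (h, i) in K of minimal total length h + i (searching up to max_len)."""
    for length in range(max_len + 1):
        found = [(h, length - h)
                 for h in range(length + 1)
                 if in_language(h, length - h)]
        if found:
            return found
    return []


# The only shortest member is h = 2, i = 3, i.e. the string "eefff".
print(shortest_members())  # [(2, 3)]
```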

To get a longer string, we can add e's on the front or f's on the end. Clearly, if we add an e on the front, we must always add at least one f on the end, and never more than two f's. We can confirm that e.eefff.f and e.eefff.ff are both in our language. This suggests a grammar:

S -> eefff | eSf | eSff
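Before attempting a proof, a candidate grammar like this can be sanity-checked by brute force. The sketch below is an illustration only (the function names `generate` and `language` are assumptions, not from the question): it enumerates everything the three productions derive up to a length bound and compares that with the strings of K up to the same bound. Agreement up to a bound is evidence, not a proof.

```python
def generate(max_len):
    """All terminal strings of length <= max_len derivable from
    S -> eefff | eSf | eSff."""
    results = set()
    frontier = ["eefff"]              # S -> eefff
    while frontier:
        s = frontier.pop()
        if len(s) > max_len or s in results:
            continue
        results.add(s)
        frontier.append("e" + s + "f")    # S -> eSf
        frontier.append("e" + s + "ff")   # S -> eSff
    return results


def language(max_len):
    """All strings e^h f^i with h < i < 2h and total length <= max_len."""
    return {"e" * h + "f" * i
            for h in range(max_len + 1)
            for i in range(max_len + 1)
            if h + i <= max_len and h < i < 2 * h}


# The two sets coincide for every length up to the bound checked here.
assert generate(30) == language(30)
```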

Once you get a candidate, you can try to prove it using mathematical induction. In our case:

Base case: the shortest string in the language, eefff, is given by the production S -> eefff.

Induction hypothesis: assume the grammar generates all strings in the language, and that everything the grammar generates is in the language, for all strings of length no more than k.

Induction step: we must show that (1) strings of length k+1 generated by the grammar are in the language and (2) strings of length k+1 in the language are generated by the grammar.

(1) A string of length k+1 generated by the grammar was generated using either S -> eSf or S -> eSff. In the first case, the string derived from the S on the RHS has length k-1; in the second, it has length k-2. In both cases, that shorter string is in the language by the induction hypothesis. That is, its counts satisfy h < i < 2h. But then (h+1) < (i+1) < (i+2) < 2(h+1), so in either case, the string is still in the language.
(2) Consider any string of length k+1 in the language. We have h + i = k + 1 and h < i < 2h. Any such string must begin with some number of e's and end with some number of f's. Consider the substrings of lengths k-1 and k-2, formed by excluding the first e together with the last f or the last two f's, respectively. Either the former is a string in the language, or the latter is. To see this, assume neither was. Then neither (h-1) < (i-1) < 2(h-1) nor (h-1) < (i-2) < 2(h-1) holds. That is:
```
((i <= h) or (i >= 2h - 1))
and ((i <= h+1) or (i >= 2h))
```

Since our string is in K, we know h < i < 2h, so we can eliminate the 1st and 4th conditions. What remains is i >= 2h - 1 and i <= h + 1, which forces h <= 2; together with h < i < 2h, that leaves only h = 2 and i = 3, the base case eefff. For any longer string, what remains is never satisfied. This demonstrates that at least one of the substrings is also in the language. By the induction hypothesis, that string is generated by the grammar. To get the string of length k+1 from that string, simply apply either S -> eSf or S -> eSff, depending on which substring is in the language (both may be).
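Part (2) of the induction step is constructive: for any string in the language longer than eefff, peel off one e and either one or two f's, whichever leaves a string still in the language, and repeat until eefff is reached. A minimal sketch of that idea follows; `derive` and `in_language` are hypothetical helper names, and a string e^h f^i is again represented by the pair (h, i).

```python
def in_language(h, i):
    """Membership test for K = { e^h f^i | 2h > i > h }."""
    return h < i < 2 * h


def derive(h, i):
    """Return the productions (outermost first) that derive e^h f^i,
    or None if the string is not in K."""
    if not in_language(h, i):
        return None
    steps = []
    while (h, i) != (2, 3):
        if in_language(h - 1, i - 1):
            steps.append("S -> eSf")
            h, i = h - 1, i - 1
        else:
            # The induction step says this substring must be in K.
            assert in_language(h - 1, i - 2)
            steps.append("S -> eSff")
            h, i = h - 1, i - 2
    steps.append("S -> eefff")
    return steps


print(derive(4, 6))  # ['S -> eSf', 'S -> eSff', 'S -> eefff']
```

When both substrings are in the language, this sketch arbitrarily prefers S -> eSf; the other choice would give a different but equally valid derivation.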
