Consider the following state diagram, which reads strings over the alphabet {0,1} and accepts if the input string contains two consecutive 0's or two consecutive 1's:

[Image: state diagram]

01001 --> Accept
101 --> Reject

How would I write the production rules to show this? Is it just:

D -> C0 | B1 | D0 | D1
C -> A0 | B0
B -> A1 | C1

And if so, how would the terminals (0, 1) be differentiated from the states (A, B, C)? And should the state go before or after the input? That is, should it be A1 or 1A, for example?

David542
  • It's hard to give a definitive answer because depending on which book you are reading, there might be slightly different definitions for those formalisms – derpirscher Dec 25 '22 at 07:30
  • @derpirscher I suppose "any" would be fine -- I just made up the above diagram, it's not from any book. – David542 Dec 25 '22 at 07:53

1 Answer

The grammar you suggest uses A but never defines it: A isn't a non-terminal because it has no production rules, and it isn't a terminal because it never appears in the input. You could make that work by writing, for example, C → 0 | B 0, but a more general solution is to make A a non-terminal with an ε-rule: A → ε, and then C → A 0 | B 0.

B0 is misleading, because it looks like a single thing. But it's two grammatical symbols, a non-terminal (B) and a terminal 0.
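One way to keep the two kinds of symbol apart is to store every right-hand side as a sequence of separate symbols and to declare both symbol sets up front. The Python sketch below does this for the left linear grammar with the A → ε fix applied. The names `V`, `SIGMA`, `P`, `check_grammar` and `is_left_linear` are illustrative choices made for this example, not part of any library.

```python
# Non-terminals (the states) and terminals (the input symbols), declared explicitly.
V = {"A", "B", "C", "D"}
SIGMA = {"0", "1"}

# Each right-hand side is a tuple of symbols, so ("B", "0") is visibly two symbols.
# The empty tuple stands for epsilon.
P = {
    "A": [()],
    "B": [("A", "1"), ("C", "1")],
    "C": [("A", "0"), ("B", "0")],
    "D": [("C", "0"), ("B", "1"), ("D", "0"), ("D", "1")],
}
START = "D"  # the accepting state of the FSM is the start symbol here


def check_grammar(v, sigma, productions):
    """True if V and SIGMA are disjoint and every symbol used is in one of them."""
    if v & sigma:
        return False
    for lhs, alternatives in productions.items():
        if lhs not in v:
            return False
        for rhs in alternatives:
            if any(sym not in v and sym not in sigma for sym in rhs):
                return False
    return True


def is_left_linear(v, productions):
    """True if a non-terminal only ever appears as the first symbol of a right-hand side."""
    return all(
        sym not in v
        for alternatives in productions.values()
        for rhs in alternatives
        for sym in rhs[1:]
    )


print(check_grammar(V, SIGMA, P))  # True
print(is_left_linear(V, P))        # True
```

With this representation the question of writing `A1` versus `1A` becomes a question of tuple order: `("A", "1")` is left linear, `("1", "A")` would be right linear.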

With those modifications, your grammar is fine. It's a left linear grammar, built from the transitions into each state. A right linear grammar can also be constructed from the FSA by considering the transitions out of each state instead. In this version, the epsilon production corresponds to final states rather than initial states.

A → 1 B | 0 C
B → 0 C | 1 D
C → 1 B | 0 D
D → 0 D | 1 D | ε
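As a rough check, the right linear grammar above can be run directly as a recognizer. The sketch below assumes the grammar is deterministic (at most one alternative per terminal for each non-terminal), which holds here; `RULES`, `NULLABLE` and `accepts` are names invented for this example.

```python
# Right linear grammar as a table: RULES[X][t] == Y encodes the alternative X -> t Y.
RULES = {
    "A": {"1": "B", "0": "C"},
    "B": {"0": "C", "1": "D"},
    "C": {"1": "B", "0": "D"},
    "D": {"0": "D", "1": "D"},
}
# Non-terminals that have an epsilon alternative (the final states of the FSM).
NULLABLE = {"D"}


def accepts(text, start="A"):
    """Follow one alternative per input symbol, then require an epsilon rule at the end."""
    current = start
    for ch in text:
        nxt = RULES[current].get(ch)
        if nxt is None:  # symbol outside the alphabet: no rule applies
            return False
        current = nxt
    return current in NULLABLE


print(accepts("01001"))  # True
print(accepts("101"))    # False
```

The loop is exactly a walk through the FSM, which is why the epsilon alternative on D plays the role of "this state is final".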

If it's not obvious why the FSM corresponds to these two grammars, it's probably worth grabbing a pad of paper and constructing a derivation with each grammar for a few sample sentences. Compare the derivations you produce with the progress through the FSM for the same input.
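For comparison with a pencil-and-paper attempt, here is a small Python sketch that prints both derivations next to the FSM state trace. The transition table `DELTA` is read off the right linear grammar and is assumed to match the diagram; the function names are made up for this example.

```python
# FSM transitions, read off the right linear grammar (assumed to match the diagram).
DELTA = {
    ("A", "0"): "C", ("A", "1"): "B",
    ("B", "0"): "C", ("B", "1"): "D",
    ("C", "0"): "D", ("C", "1"): "B",
    ("D", "0"): "D", ("D", "1"): "D",
}


def state_trace(text, start="A"):
    """States visited by the FSM, starting state included."""
    states = [start]
    for ch in text:
        states.append(DELTA[(states[-1], ch)])
    return states


def right_linear_derivation(text):
    """Sentential forms: the prefix read so far, followed by the current state."""
    states = state_trace(text)
    if states[-1] != "D":
        raise ValueError("string is not accepted, so it has no derivation")
    forms = [text[:i] + s for i, s in enumerate(states)]
    return forms + [text]  # last step applies D -> epsilon


def left_linear_derivation(text):
    """Sentential forms: a state followed by the suffix still to be read from it."""
    states = state_trace(text)
    if states[-1] != "D":
        raise ValueError("string is not accepted, so it has no derivation")
    forms = [s + text[i:] for i, s in reversed(list(enumerate(states)))]
    return forms + [text]  # last step applies A -> epsilon


print(" ".join(state_trace("01001")))
print(" => ".join(right_linear_derivation("01001")))
print(" => ".join(left_linear_derivation("01001")))
```

For `01001` the right linear derivation visits the states in the order the FSM does (A, C, B, C, D, D), while the left linear derivation starts at the accepting state D and visits the same states in reverse.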

rici
  • Thanks for this. A few follow-up questions: (1) Why do you have `| ε` in the final `D` production? If there weren't a self-loop on `1,0`, would the `ε` be optional? (2) Is there a standard way to differentiate between terminals and non-terminals in-line, for example `A -> '1' B`, or is the distinction made only through the sets `V` (non-terminals) and `Σ` (terminals)? – David542 Dec 26 '22 at 21:35
  • also as a side-note why did you delete the answer on https://stackoverflow.com/questions/74911725/difference-in-token-types? I found that answer, like all of your others, as great. – David542 Dec 26 '22 at 21:37
  • (1) because it's a final state. So if you reach it, you're there and you don't need another input. (2) yes, by definition. Although you can also do it implicitly: a non-terminal is on the LHS of some production and a terminal isn't. But that implicit definition only works if the grammar has been reduced to remove unreachable and non-productive non-terminals, so the formal model insists that the sets V and Σ be defined and that every grammar symbol be in exactly one of those sets. – rici Dec 26 '22 at 21:41
  • ... I wasn't very happy with it so I wanted to rethink it a bit. – rici Dec 26 '22 at 21:42