Regular Grammar to my Regex/DFA

Question

I have the following regular expression: ((abc)+d)|(ef*g?)

I have created a DFA (I hope it is correct), which you can see here:

http://www.informatikerboard.de/board/attachment.php?attachmentid=495&sid=f4a1d32722d755bdacf04614424330d2

The task is to create a regular grammar (Chomsky hierarchy type 3) for it, and I don't understand how to do that. My attempt looks like this:

S → aT

T → b

T → c

T → dS

S → eT

S → eS

T → ε

T → f

T → fS

T → gS

Best Regards Patrick

Isn't type 3 Chomsky exactly the class of the regular grammars allowing rules `A -> aA` and `A -> a` only? If so, this is already in Chomsky form... — ShellFish, Jan 14 '15 at 17:51
I don't know, it is like I said: I don't get it... that's why I'm asking. — AustriaWien, Jan 14 '15 at 17:56
Your DFA is correct :) Don't be shy about using many non-terminals; sometimes you'll need a lot of them to make a regex work. — ShellFish, Jan 14 '15 at 18:16
My pleasure, you can thank me by accepting the answer and upvote (only) if you like it! — ShellFish, Jan 14 '15 at 18:26
Important side note: the epsilon rule `S -> eps` can ONLY be a valid rule in type 3 Chomsky if the empty string is accepted by the regex!! — ShellFish, Jan 14 '15 at 18:27
I've got another question: you chose state names like S, T, U and so on. Does it matter how these states are named? Can I name them like I named my states in the DFA? It would look like this: Z1 → aZ2 Z2 → bZ3 Z3 → cZ4 Z4 → cZ2 Z4 → dZ5 Z5 → ε Z1 → eZ6 Z6 → ε Z6 → fZ7 Z6 → gZ8 Z7 → ε Z7 → fZ7 Z7 → gZ8 Z8 → ε Does ε mean a final / empty state? — AustriaWien, Jan 14 '15 at 18:27
You can name them however you like! Normally no mathematician would worry about semantics like that, as long as you're consistent! — ShellFish, Jan 14 '15 at 18:28
All right, I got it. But in your example V -> d, wouldn't it be correct to do it like this: V -> dW W -> ε? Isn't ε a final state? — AustriaWien, Jan 14 '15 at 18:46
Well, epsilon is not a state, it's a transition. You wouldn't write epsilon in a vertex but on an edge. It means an empty input. In the context of grammars there aren't really states, only terminals and non-terminals. When you have a rule `A -> a`, the string stops growing because there isn't a new non-terminal (hence terminal <-> terminating). Having a rule `A -> eps` means you add an empty input and you can no longer add anything to it, since epsilon is a terminal char. — ShellFish, Jan 14 '15 at 19:33
I would like to tell you more, but the comments are starting to get spammy, so please bundle your questions and post a new question or enter chat. Please also accept the answer! — ShellFish, Jan 14 '15 at 19:34
Okay, one last question regarding this topic. In your solution for the second part, ef*g?, aren't you missing something? Wouldn't it be correct like this: Z1->eZ6 Z1->e Z6->fZ7 Z6->f Z6->g Z7->fZ7 Z7->g ? — AustriaWien, Jan 16 '15 at 19:19
No, it's correct I believe; yours is correct too but not minimal. What string can't you make, or which can you make that is invalid? — ShellFish, Jan 16 '15 at 19:27
Sorry, I don't get your last sentence? So my solution isn't wrong? — AustriaWien, Jan 16 '15 at 19:42
No, it isn't wrong, it's just not optimal. The fewer rules, the more optimal. — ShellFish, Jan 16 '15 at 19:50

ShellFish · Accepted Answer · 2015-01-14T18:18:12.427

Type 3 in the Chomsky hierarchy is the class of regular grammars, which are restricted to rules of the following forms:

X -> aY
X -> a,

in which X and Y are arbitrary non-terminals and a is an arbitrary terminal. Under the usual strict definition, the rule S -> eps is only allowed for the start symbol S, and only if S does not appear on any right-hand side.
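If you want to check a rule set against these forms mechanically, here is a minimal sketch in Python. The function name `is_right_linear` and the string format (single uppercase letters for non-terminals, single lowercase letters for terminals, `eps` for the empty string) are my own assumptions for illustration, not part of any standard tool.

```python
import re

# Assumed notation: "X -> aY", "X -> a" or "X -> eps", with one uppercase
# letter per non-terminal and one lowercase letter per terminal.
RULE_PATTERN = re.compile(r"([A-Z]) -> (?:eps|([a-z])([A-Z])?)")

def is_right_linear(rules, start="S"):
    """Return True if every rule has the form X -> aY or X -> a, with
    start -> eps allowed only when start is on no right-hand side."""
    has_start_eps = False
    rhs_nonterminals = set()
    for rule in rules:
        match = RULE_PATTERN.fullmatch(rule)
        if match is None:
            return False              # not one of the allowed shapes
        lhs, terminal, nonterminal = match.groups()
        if terminal is None:          # the rule is "X -> eps"
            if lhs != start:
                return False
            has_start_eps = True
        elif nonterminal is not None:
            rhs_nonterminals.add(nonterminal)
    return not (has_start_eps and start in rhs_nonterminals)
```

For example, `is_right_linear(["S -> aT", "T -> eps"])` is rejected under this strict reading, because the epsilon rule belongs to a non-terminal other than the start symbol.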

Construction

We notice that the regular expression consists of two alternatives, either (abc)+d or ef*g?, so our first rules will be S -> aT and S -> eP. These rules allow us to start building one of the two alternatives. Note that the two non-terminals must be different, because they correspond to completely disjoint paths in the automaton. Next we continue with both sub-expressions separately:

(abc)+d: We have at least one sequence abc, followed by zero or more further occurrences of abc and then a final d. We can model it like this:

S -> aT
T -> bU
U -> cV
V -> aT   # repeat pattern
V -> d    # finish word

ef*g?: Here we have an e followed by zero or more f characters and an optional g. Since the first character is already produced by one of the rules for S, we continue like this:

S -> eP
S -> e    # from the starting state we can simply add an 'e' and be done with it,
          # this is an accepted word!
P -> fP   # keep adding f chars to the word
P -> f    # add f and stop, if optional g doesn't occur
P -> g    # stop and add a 'g'

Conclusion

Put these together and they will form a grammar for the language. I tried to write down the train of thought so you could understand it.
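To convince yourself that the combined rules really describe the same language as the regex, you can compare them on some sample words. Below is a minimal sketch in Python; the dictionary layout and the names `GRAMMAR` and `derives` are my own choices for illustration, and the rules are copied from the two steps above.

```python
import re

# Combined rules from the construction. Each right-hand side is a pair
# (terminal, next_nonterminal); None means the derivation stops there.
GRAMMAR = {
    "S": [("a", "T"), ("e", "P"), ("e", None)],
    "T": [("b", "U")],
    "U": [("c", "V")],
    "V": [("a", "T"), ("d", None)],
    "P": [("f", "P"), ("f", None), ("g", None)],
}

def derives(word, start="S"):
    """Return True if the grammar can derive `word` from `start`."""
    # Track every non-terminal that can be pending after the prefix read
    # so far; None marks a derivation that has already terminated.
    current = {start}
    for symbol in word:
        current = {
            nxt
            for nonterminal in current
            if nonterminal is not None
            for terminal, nxt in GRAMMAR[nonterminal]
            if terminal == symbol
        }
    return None in current

REGEX = re.compile(r"((abc)+d)|(ef*g?)")
for word in ["abcd", "abcabcd", "e", "eff", "efg", "eg", "abc", "egf", ""]:
    # The grammar and the regex should agree on every sample word.
    assert derives(word) == (REGEX.fullmatch(word) is not None)
```

This only tests the words you feed it, so it is a sanity check rather than a proof. It is still handy for spotting a word that a candidate grammar cannot produce.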

As an exercise, try this regex: (a+b*)?bc(a|b|c)*
