
I'm trying to construct a regular expression from a finite automaton, but I've found myself completely stuck with this one. The regex syntax I have to use is:

? = 0 or 1
* = 0 or more
+ = 1 or more
| = or
_ = empty string
@ = empty set
() = parentheses

As I understand it, every accepted string must either be "b*", end with "a*", or end with "a+bb+".
What I have now is ((b*(a+(bb))*)*), but that doesn't take into account a string ending with 'a'.

As said, I'm 100% stuck with this and just can't get my head around how I am supposed to work with this.

image: http://img593.imageshack.us/img593/2563/28438387.jpg

CODE:
Type of the automaton
FA

States
q1
q2
q3
q4

Alphabet
a
b

Initial state
q3

Final states
q3
q4

Transitions
q1 a q2
q1 b q3
q2 a q2
q2 b q2
q3 a q4
q3 b q3
q4 a q4
q4 b q1

Any solutions or tips appreciated!

Laurence Gonsalves
mjuopperi

2 Answers


If you feed this automaton to a tool for automata (e.g., Vcsn), you get this:

In [1]: import vcsn

In [2]: %%automaton a
   ...: $  -> q3
   ...: q1 -> q2 a
   ...: q1 -> q3 b
   ...: q2 -> q2 a
   ...: q2 -> q2 b
   ...: q3 -> q4 a
   ...: q3 -> q3 b
   ...: q4 -> q4 a
   ...: q4 -> q1 b
   ...: q3 -> $
   ...: q4 -> $
   ...: 
mutable_automaton<letterset<char_letters(ab)>, b>

In [3]: a.expression()
Out[3]: (b+aa*bb)*(\e+aa*)

Here \e denotes the empty string and + denotes alternation (the | of the question's syntax), so what remains is only a matter of syntax conversion.
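To make the syntax conversion concrete, here is a minimal Python sketch. It assumes the Vcsn output uses only `+` (alternation), `*`, parentheses, and `\e` (empty word); the helper name `vcsn_to_python_regex` is made up for this illustration and is not part of Vcsn. In the question's notation the same expression would read `(b|aa*bb)*(_|aa*)`.

```python
import re


def vcsn_to_python_regex(expr: str) -> str:
    """Translate a Vcsn-style expression into Python `re` syntax.

    Assumption: the expression contains only letters, parentheses, `*`,
    `+` (alternation) and `\\e` (the empty word).
    """
    # `\e` becomes an empty alternative; `+` becomes `|`.
    return expr.replace(r"\e", "").replace("+", "|")


VCSN_EXPRESSION = r"(b+aa*bb)*(\e+aa*)"
PATTERN = re.compile(vcsn_to_python_regex(VCSN_EXPRESSION))


def accepts(word: str) -> bool:
    """Return True if the whole word matches the converted expression."""
    return PATTERN.fullmatch(word) is not None


if __name__ == "__main__":
    for word in ["", "b", "a", "ab", "abb", "abba", "abbab"]:
        print(repr(word), accepts(word))
```

Strings such as "ab" and "abbab" stop in q1, which is not final, so they should be rejected, while "abb" and "abba" should be accepted.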

Graphically:

[Image: Vcsn graphical rendering]

See this example live, and toy with it.

akim

It isn't possible to get from q2 to a final state. Remove it and the resulting DFA should be easier to convert.
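As a rough illustration of why q2 can be dropped, the sketch below computes the states from which a final state is reachable and keeps only the transitions between them. The function names `useful_states` and `trim` are my own, not from any library; the transition table is copied from the question.

```python
# Transition table copied from the question: (state, symbol) -> next state.
TRANSITIONS = {
    ("q1", "a"): "q2", ("q1", "b"): "q3",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
    ("q3", "a"): "q4", ("q3", "b"): "q3",
    ("q4", "a"): "q4", ("q4", "b"): "q1",
}
FINAL = {"q3", "q4"}


def useful_states(transitions, final_states):
    """Return the states from which at least one final state is reachable."""
    useful = set(final_states)
    changed = True
    while changed:  # repeat until no new state is added
        changed = False
        for (src, _symbol), dst in transitions.items():
            if dst in useful and src not in useful:
                useful.add(src)
                changed = True
    return useful


def trim(transitions, final_states):
    """Drop every transition that touches a state that can never accept."""
    keep = useful_states(transitions, final_states)
    return {
        (src, symbol): dst
        for (src, symbol), dst in transitions.items()
        if src in keep and dst in keep
    }


if __name__ == "__main__":
    print(sorted(useful_states(TRANSITIONS, FINAL)))
    for (src, symbol), dst in sorted(trim(TRANSITIONS, FINAL).items()):
        print(src, symbol, dst)
```

The trimmed automaton is partial: a missing transition (here q1 on 'a') simply means the string is rejected.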

As I understand it, every accepted string must either be "b*", end with "a*", or end with "a+bb+". What I have now is ((b*(a+(bb))*)*), but that doesn't take into account a string ending with 'a'.

Imagine q3 was not a final state, and q4 was the initial state. What would the regex look like then? Changing that into what you want shouldn't be too hard, just don't be afraid to have the same state and/or transitions described by more than one part of the regex.

One more hint: I'm pretty sure you're going to need to use either ? or | at least once.

Laurence Gonsalves
  • I've basically ignored q2 and just considered that "ab" must be followed by at least one more 'b'. (Thus the "end with "a+bb+"") – mjuopperi Dec 20 '10 at 16:12
  • @Gawwad Ok, I added some more hints based on the specific issue you mentioned in your question. – Laurence Gonsalves Dec 20 '10 at 16:43
  • Would it be "(a*(bb+a+)?)*" ? Should I use ()* around the "round" to account for strings such as "abbaabb" or is that necessary? – mjuopperi Dec 20 '10 at 16:56
  • @Gawwad I think that's closer, though you're not matching a lone b with that regex. I'm not sure what you mean by "around the 'round'". It's often easier to convert a regex to an FA than the reverse, so you might want to try converting your regex to an FA and then stepping through both automata together to see if there are any paths to an accepting state that exist in one but not the other. – Laurence Gonsalves Dec 20 '10 at 17:13
  • I worded that a bit wrong. What I mean is would "a+b+" match "abab" or would I have to use "()*" around the regex to make it match recursively? – mjuopperi Dec 20 '10 at 17:17
  • @Gawwad: No, `a+b+` would not match "abab". It matches one or more a's followed by one or more b's. Yes, `(a+b+)*` would match "abab". – Laurence Gonsalves Dec 20 '10 at 17:27
  • @Gawwad: Sorry, I just realized that an earlier edit I made didn't say what I meant to say. I meant to say "Imagine q3 was not a final state". Ignore the "and q4 was the initial state" bit. That change should make it easier to get something close to what you want. Though the last regex you had in the comments was not very far at all from being correct either... – Laurence Gonsalves Dec 20 '10 at 17:37
  • With q3 not being final, would that work out to be "(a+(bb+a+)?)*" ? I came up with "((b*(a+bb+)?)|(b*a*))*" for the original problem. But I think that matches "ab". – mjuopperi Dec 20 '10 at 17:49
  • "b*((a+bb+)|(a*))*" I think should be the correct solution. However, the validator gives me this error "Given regular expression has different alphabet than expected." which makes absolutely no sense to me. – mjuopperi Dec 20 '10 at 19:19
  • @Gawwad:`"(a+(bb+a+)?)*"` isn't quite right. (it doesn't match abbabba, for example). – Laurence Gonsalves Dec 20 '10 at 19:35
  • @Gawwad: Here's the approach I'd use: Roughly, you want a `+` or `*` for each cycle. There are 3 cycles: q4 to itself (`a*`), q3 to itself (`b*`), and the q4 to q1 to q3 to q4 cycle (`*` on everything). So you're going to have something like `(...a*...b*)*` as your "main loop". You (may) then need to add a prefix to get the initial state right, and possibly suffix(es) to get the final state(s) right. It may also help to just forget about `+`, as `x+` is just shorthand for `xx*`. You can always go back at the end and compact uses of `*` into `+` where appropriate. – Laurence Gonsalves Dec 20 '10 at 19:37
  • @Gawwad: yeah, that looks like a correct solution to me. There are multiple equivalent solutions. This is a long shot, but try turning the `x+` into `xx*`. Some regex engines don't support `+`. – Laurence Gonsalves Dec 20 '10 at 19:41
  • Thanks a million mate, I've got it now! Final error was the validator not accepting '+'. Have a merry Christmas! :) – mjuopperi Dec 20 '10 at 19:57
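A cruder alternative to stepping through two automata by hand is to brute-force every string up to some length and compare the DFA from the question with a candidate regex. The sketch below does that with Python's `re` module; the function names and the length bound of 8 are arbitrary choices for this illustration, and passing the check is only evidence of equivalence, not a proof.

```python
import re
from itertools import product

# DFA copied from the question: (state, symbol) -> next state.
TRANSITIONS = {
    ("q1", "a"): "q2", ("q1", "b"): "q3",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
    ("q3", "a"): "q4", ("q3", "b"): "q3",
    ("q4", "a"): "q4", ("q4", "b"): "q1",
}
START = "q3"
FINAL = {"q3", "q4"}


def dfa_accepts(word):
    """Run the DFA on the word and report whether it ends in a final state."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in FINAL


def first_mismatch(pattern, max_len=8):
    """Return the first word (shortest first) on which the regex and the
    DFA disagree, or None if they agree on every word up to max_len."""
    regex = re.compile(pattern)
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            if dfa_accepts(word) != (regex.fullmatch(word) is not None):
                return word
    return None


if __name__ == "__main__":
    print(first_mismatch(r"((b*(a+(bb))*)*)"))     # attempt from the question
    print(first_mismatch(r"b*((a+bb+)|(a*))*"))    # final regex from the comments
    print(first_mismatch(r"b*((aa*bbb*)|(a*))*"))  # same regex with x+ written as xx*
```

The attempt from the question should fail on "a", the case the asker had already spotted, while the final regex and its `+`-free rewrite should show no mismatch within the bound.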