3

I'm looking for a test suite of deterministic finite automata to be used for testing the correctness of DFA minimization algorithms. Could you give me some pointers? Or are there algorithms/implementations available that will generate such automata?

To win the bounty, you'll need to submit a test suite of 400 or more non-minimal automata of various sizes and complexities, at least 20 containing more than 2000 nodes.

If this isn't the right place to ask this question, please direct me to some better places. Thanks.

Peter O.
ShyPerson

2 Answers

1

To test correctness, you could convert your DFAs to OpenFst format and check that each minimized acceptor is equivalent to its original using OpenFst's equivalence operation.
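If you would rather not depend on an external toolkit, the same equivalence check can be sketched in plain Python. This is not OpenFst's API; the `(start, accepting, delta)` triple and the name `dfa_equivalent` are assumptions made for illustration, and both DFAs are assumed to be complete (a transition for every state and symbol).

```python
from collections import deque


def dfa_equivalent(dfa_a, dfa_b, alphabet):
    """Return True if two complete DFAs accept the same language.

    Each DFA is a hypothetical (start, accepting, delta) triple, where
    delta maps (state, symbol) -> state for every state and symbol.
    """
    start_a, accepting_a, delta_a = dfa_a
    start_b, accepting_b, delta_b = dfa_b
    seen = {(start_a, start_b)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        # A reachable pair that disagrees on acceptance proves a difference.
        if (p in accepting_a) != (q in accepting_b):
            return False
        for symbol in alphabet:
            pair = (delta_a[(p, symbol)], delta_b[(q, symbol)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True


# Example: "even number of a's" as a redundant 4-state DFA and a 2-state DFA.
big = (0, {0, 2}, {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0})
small = ("even", {"even"}, {("even", "a"): "odd", ("odd", "a"): "even"})
```

Calling `dfa_equivalent(big, small, "a")` walks the reachable pairs of the product automaton, so its cost is bounded by the product of the two state counts times the alphabet size.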

Paul Dixon
  • Am I missing something? I can see how this nice tool can verify minimized automata, but I don't see where the initial non-minimized automata come from. – ShyPerson Jan 19 '12 at 05:00
0

Testing "all" DFAs up to n states and m alphabet symbols is infeasible. You could instead test DFAs whose minimal DFAs are known. To get (DFA, minimal DFA) pairs, you could generate random REs, build an NFA-lambda using the algorithm from Kleene's theorem, convert it to a DFA using the subset construction, and then minimize that DFA with a known-correct minimization algorithm (I assume you accept that the canonical algorithm is correct).

EDIT:

To expand on what I said, here's how I would try to generate a test suite of non-minimal finite automata:

  1. Generate a regular expression using N operations (concatenation, union, Kleene closure).
  2. Use the algorithm from Kleene's theorem to get an NFA-lambda with O(N) states in it.
  3. Use the subset/powerset construction to get a DFA with up to O(2^N) states in it.
  4. Repeat until you have found a sufficient number of sufficiently complex automata.
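Step 3 can be sketched as follows. This is a minimal illustration, not a tuned implementation: the dictionary-based NFA-lambda encoding (`trans`, `eps`) and the function names are assumptions chosen for the example.

```python
def lambda_closure(states, eps):
    """All states reachable from `states` using only lambda moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def subset_construction(start, accepting, trans, eps, alphabet):
    """Convert an NFA-lambda to a complete DFA.

    trans: dict mapping (state, symbol) -> set of states
    eps:   dict mapping state -> set of states reachable by one lambda move
    Returns (dfa_start, dfa_accepting, dfa_delta); DFA states are frozensets.
    """
    dfa_start = lambda_closure({start}, eps)
    delta, seen, todo = {}, {dfa_start}, [dfa_start]
    while todo:
        current = todo.pop()
        for a in alphabet:
            moved = set()
            for s in current:
                moved |= trans.get((s, a), set())
            target = lambda_closure(moved, eps)
            delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting if it contains any accepting NFA state.
    dfa_accepting = {S for S in seen if S & accepting}
    return dfa_start, dfa_accepting, delta


# Hypothetical NFA-lambda for "strings over {a, b} ending in a".
trans = {(1, "a"): {1, 2}, (1, "b"): {1}}
eps = {0: {1}}
dfa_start, dfa_accepting, dfa_delta = subset_construction(0, {2}, trans, eps, "ab")
```

For this small input the construction reaches three subsets, {0, 1}, {1} and {1, 2}, while the minimal DFA for "ends in a" needs only two states, so even tiny inputs can yield non-minimal automata.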

Generating the regular expressions is the easy part. There are a few rules:

  1. a is an RE if a is an alphabet symbol
  2. (rs) is an RE if r, s are REs
  3. (r+s) is an RE if r, s are REs
  4. (r*) is an RE if r is an RE
  5. Nothing else is an RE

To get an RE with n operations, a recursive approach works.

GetRE(ops)
  if ops = 0 then return RandomAlphabetSymbol()
  select(Rand() % 3)
  case 0 then                      // union
    ops1 = Rand() % ops            // 0 .. ops - 1, so ops = 1 is safe
    ops2 = (ops - 1) - ops1
    return "(" + GetRE(ops1) + "+" + GetRE(ops2) + ")"
  case 1 then                      // concatenation
    ops1 = Rand() % ops
    ops2 = (ops - 1) - ops1
    return "(" + GetRE(ops1) + "." + GetRE(ops2) + ")"
  case 2 then                      // Kleene closure
    return "(" + GetRE(ops - 1) + "*)"
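A Python rendering of this recursive generator might look like the sketch below. The name `get_re` and the `alphabet` and `rng` parameters are assumptions added for illustration. The left operand's operator count is drawn from 0 to ops - 1, which keeps the split well defined when ops = 1.

```python
import random


def get_re(ops, alphabet="ab", rng=random):
    """Return a random, fully parenthesized RE string with exactly `ops` operators."""
    if ops == 0:
        return rng.choice(alphabet)
    kind = rng.randrange(3)
    if kind == 2:
        # Kleene closure uses one operator; the rest go to the operand.
        return "(" + get_re(ops - 1, alphabet, rng) + "*)"
    # Split the remaining ops - 1 operators between the two operands.
    left = rng.randrange(ops)  # 0 .. ops - 1
    right = (ops - 1) - left
    op = "+" if kind == 0 else "."
    return "(" + get_re(left, alphabet, rng) + op + get_re(right, alphabet, rng) + ")"
```

Passing a seeded `random.Random` instance as `rng` makes a generated suite reproducible. For very large `ops` the recursion depth can approach `ops`, so Python's recursion limit may need raising.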

You might find that a non-string representation (i.e., a hierarchical linked structure, essentially the parse tree itself) is more convenient for applying Kleene's algorithm to get the NFA-lambda.
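As a sketch of that idea, the parse tree can be a nested tuple, and the NFA-lambda can be built by one recursive pass over it. The tuple tags (`"union"`, `"concat"`, `"star"`), the function name `thompson`, and the returned `(start, accept, trans, eps)` shape are all assumptions for illustration; the construction is the usual Thompson-style one, with two fresh states per tree node.

```python
import itertools


def thompson(tree, counter=None):
    """Build an NFA-lambda from a parse tree.

    tree is a symbol (str), ("union", r, s), ("concat", r, s), or ("star", r).
    Returns (start, accept, trans, eps), where trans maps
    (state, symbol) -> set of states and eps maps state -> set of states.
    """
    counter = counter if counter is not None else itertools.count()
    start, accept = next(counter), next(counter)
    trans, eps = {}, {}

    def link(src, dst):
        # Add a lambda move from src to dst.
        eps.setdefault(src, set()).add(dst)

    if isinstance(tree, str):
        trans[(start, tree)] = {accept}
        return start, accept, trans, eps

    op, *args = tree
    parts = [thompson(sub, counter) for sub in args]
    # Merge the sub-automata; their state numbers never collide.
    for _, _, t, e in parts:
        trans.update(t)
        for src, dsts in e.items():
            eps.setdefault(src, set()).update(dsts)

    if op == "union":
        for s, a, _, _ in parts:
            link(start, s)
            link(a, accept)
    elif op == "concat":
        (s1, a1, _, _), (s2, a2, _, _) = parts
        link(start, s1)
        link(a1, s2)
        link(a2, accept)
    elif op == "star":
        ((s, a, _, _),) = parts
        link(start, s)       # enter the loop body
        link(a, start)       # repeat
        link(start, accept)  # or stop (covers zero repetitions)
    else:
        raise ValueError("unknown operator: %r" % (op,))
    return start, accept, trans, eps
```

For example, `thompson(("concat", ("star", ("union", "a", "b")), "a"))` builds an NFA-lambda for strings over {a, b} that end in a, using two states per node of the tree.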

Patrick87