I am trying to implement a regex-to-NFA converter. I have most of the code written, but I am struggling to find a way to build a graph with a cycle, given my representation of states (nodes) and edges.
My graph representation is as follows:
    type state =
      | State of int * edge list (* Node ID and outgoing edges *)
      | Match                    (* Match state for the NFA: no outgoing edges *)
    and edge =
      | Edge of state * string   (* End state and label *)
      | Epsilon of state         (* End state *)
My function to convert a regex to an NFA is basically a pattern match on the regex. It takes the regex and the "final state" (where all the outgoing edges of the NFA fragment will go) and returns the "start state" of the (partially built) NFA for that regex. Each fragment is built by returning a State constructed with its outgoing edge list, where each edge's end state is constructed via a recursive call.
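A minimal sketch of how the acyclic cases could look under this scheme. The regex constructors Lit, Concat and Alt are hypothetical names, not taken from the actual code, and fresh_id is an assumed helper standing in for however count is produced (it uses a ref purely for brevity).

```ocaml
(* Graph representation copied from the question. *)
type state =
  | State of int * edge list (* Node ID and outgoing edges *)
  | Match                    (* Match state: no outgoing edges *)
and edge =
  | Edge of state * string   (* End state and label *)
  | Epsilon of state         (* End state *)

(* Hypothetical regex type: the question does not show its own. *)
type regex =
  | Lit of string
  | Concat of regex * regex
  | Alt of regex * regex

(* Assumed ID generator standing in for the question's [count]. *)
let counter = ref 0
let fresh_id () = incr counter; !counter

(* Build the fragment for [re] whose outgoing edges all lead to
   [final_state], and return the fragment's start state. *)
let rec regex2nfa re final_state =
  match re with
  | Lit s ->
    (* One labelled edge straight to the final state. *)
    State (fresh_id (), [Edge (final_state, s)])
  | Concat (a, b) ->
    (* The start state of [b] serves as the final state of [a]. *)
    regex2nfa a (regex2nfa b final_state)
  | Alt (a, b) ->
    (* A fresh state with an epsilon edge into each alternative. *)
    State (fresh_id (),
           [Epsilon (regex2nfa a final_state);
            Epsilon (regex2nfa b final_state)])
```

None of these cases needs a back edge: every end state is fully built before the state that points to it, which is exactly what breaks down for Kleene star and +.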
Most of the code is easy, but I am having trouble building the NFA for Kleene star and +, which require cycles in the graph. Given my representation I end up with something like:
    let rec regex2nfa regex final_state =
      match regex with
      ... (* Other cases... *)
      | KleeneStar(re) ->
        let s = State(count, [Epsilon(regex2nfa re s); Epsilon(final_state)]) in
        s
Obviously this doesn't compile, as s is undefined at this point. However, I also cannot add the "rec" keyword, because the compiler will (rightfully) reject such a recursive value definition: the right-hand side passes s to a function call before s exists. I can't get around this by using Lazy either, because forcing the evaluation of "s" will recursively force it (again and again...). Basically I have a chicken-and-egg problem here - I need to pass the "state" reference, before it is fully constructed, to another state that will have an edge back to it, but of course the original state must be fully constructed to be passed in the recursive call.
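To make the restriction concrete, here is a small sketch of what "let rec" does accept: a cyclic value whose right-hand sides are built only from constructors. It hand-writes the NFA for a* with arbitrary IDs 0 and 1, which are assumptions for illustration. This only works when the shape of the graph is known statically, which is not the case inside regex2nfa.

```ocaml
(* Graph representation copied from the question. *)
type state =
  | State of int * edge list (* Node ID and outgoing edges *)
  | Match                    (* Match state: no outgoing edges *)
and edge =
  | Edge of state * string   (* End state and label *)
  | Epsilon of state         (* End state *)

(* Accepted: both right-hand sides consist solely of constructor
   applications, so the compiler can allocate [s] and [body] first
   and then fill in the pointers, producing a genuine cycle. *)
let rec s = State (0, [Epsilon body; Epsilon Match])
and body = State (1, [Edge (s, "a")])

(* Rejected at compile time (left commented out), because [s] is
   passed to a function call on the right-hand side:

   let rec s = State (0, [Epsilon (regex2nfa re s); Epsilon final_state])
*)
```

Note that structural equality (=) may not terminate on such cyclic values, so comparisons should use physical equality (==).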
Is there any way to do this without using references/mutable records? I would really like to keep this as functional as possible, but I don't see a way around it given the situation... Does anyone have suggestions?