
I have two finite-state acceptors, a.fst and b.fst. Both are unweighted and simply encode sequences of words. Their symbol (word) lists overlap but are not identical. I want to take the union of these two FSTs.

I think some kind of symbol normalization or mapping needs to be done before calling fstunion, but I'm not sure how to do that.

Jiaji Huang
  • You'll first need to ensure they use the same alphabet -- usually just the union of the alphabets. – Chris Dodd Aug 18 '22 at 18:27
  • Yeah, I can union their alphabets first, but then how can I remap the arcs to use this new alphabet? – Jiaji Huang Aug 18 '22 at 18:29
  • None of the arcs change -- they use the same symbol from the original alphabet which is still present in the new alphabet. For the states, the easiest is to add a new start state with epsilon transitions to the start states of the original FSMs. You now have a valid NFA, which you can convert to a DFA if you wish. – Chris Dodd Aug 18 '22 at 18:33
  • Can this process be done by calling OpenFst from bash? Could you share the commands? – Jiaji Huang Aug 18 '22 at 18:37
  • I'm not familiar with `openfst`, but given that it is a general tool for FSTs, I imagine it supports doing these things (and probably union directly). – Chris Dodd Aug 18 '22 at 18:55
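
The steps from the comments (merge the alphabets, then add a new start state with epsilon transitions to both original start states) can be sketched in plain Python. This is only an illustration of the idea, not OpenFst's API or command line: the dict layout, the function names, and the `<eps>` symbol with id 0 are all assumptions made for the example. It also assumes that arcs store integer ids rather than the words themselves. In that case the two files may assign different ids to the same word, so the arcs do need relabeling against the merged table before the union is built.

```python
EPS = "<eps>"  # assumed name for the epsilon symbol, always stored with id 0


def merge_symbols(*tables):
    """Build one word -> id table covering every input table."""
    merged = {EPS: 0}
    for table in tables:
        for word in table:
            if word not in merged:
                merged[word] = len(merged)
    return merged


def relabel(fsa, old_table, new_table):
    """Rewrite integer arc labels from old_table ids to new_table ids."""
    id_to_word = {idx: word for word, idx in old_table.items()}
    arcs = [(src, new_table[id_to_word[lab]], dst) for src, lab, dst in fsa["arcs"]]
    return {"start": fsa["start"], "finals": set(fsa["finals"]), "arcs": arcs}


def num_states(fsa):
    """Number of states, assuming states are numbered 0..n-1."""
    states = {fsa["start"], *fsa["finals"]}
    for src, _, dst in fsa["arcs"]:
        states.update((src, dst))
    return max(states) + 1


def union(a, b):
    """New start state 0 with epsilon arcs to the shifted start states of a and b."""
    off_a, off_b = 1, 1 + num_states(a)
    arcs = [(0, 0, a["start"] + off_a), (0, 0, b["start"] + off_b)]
    arcs += [(s + off_a, lab, d + off_a) for s, lab, d in a["arcs"]]
    arcs += [(s + off_b, lab, d + off_b) for s, lab, d in b["arcs"]]
    finals = {f + off_a for f in a["finals"]} | {f + off_b for f in b["finals"]}
    return {"start": 0, "finals": finals, "arcs": arcs}


def accepts(fsa, words, table):
    """Simulate the NFA, following epsilon arcs (label 0) for free."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            state = stack.pop()
            for src, lab, dst in fsa["arcs"]:
                if src == state and lab == 0 and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({fsa["start"]})
    for word in words:
        if word not in table:
            return False
        lab = table[word]
        current = closure({d for s, l, d in fsa["arcs"] if s in current and l == lab})
    return bool(current & fsa["finals"])


# Toy data: "world" has id 2 in the first table but id 1 in the second.
syms_a = {EPS: 0, "hello": 1, "world": 2}
syms_b = {EPS: 0, "world": 1, "peace": 2}
fsa_a = {"start": 0, "finals": {2}, "arcs": [(0, 1, 1), (1, 2, 2)]}  # hello world
fsa_b = {"start": 0, "finals": {2}, "arcs": [(0, 1, 1), (1, 2, 2)]}  # world peace

merged = merge_symbols(syms_a, syms_b)
u = union(relabel(fsa_a, syms_a, merged), relabel(fsa_b, syms_b, merged))

print(accepts(u, ["hello", "world"], merged))
print(accepts(u, ["world", "peace"], merged))
print(accepts(u, ["hello", "peace"], merged))
```

The result is a nondeterministic acceptor with epsilon arcs. Removing the epsilons and determinizing it would be separate steps and are not shown here.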
