Not quite sure if this is the correct forum, but it was suggested at Theoretical Computer Science that I move it here...

What is the typical alphabet size of Finite State Machines?

I am currently implementing a high-performance FA library and need to make some design decisions before continuing. My state space will be on the order of 2 147 483 647 (Integer.MAX_VALUE) states, which I feel is more than enough, even for non-general use. All that remains is the alphabet space.

Is there any merit in assuming that the alphabet will usually consist only of displayable characters (in which case each symbol can be stored as a byte, which would give really good performance)? Or should alphabet symbols instead be translated into Strings, so that you have alphabet labels? In that case I would need to keep a Map that translates each String into an int, short, or byte, depending on how large I want the alphabet to be.
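To make the second option concrete, here is a minimal sketch of such a label-to-id map. The class name `SymbolTable` and its methods are hypothetical, chosen only for illustration; the idea is that labels are translated once at the boundary and the automaton itself works on plain ints (or on a narrower type, if the alphabet is known to be small).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any existing library: interns String
// labels as dense int ids (0, 1, 2, ...) so the automaton core only sees ints.
final class SymbolTable {
    private final Map<String, Integer> idsByLabel = new HashMap<>();
    private final List<String> labelsById = new ArrayList<>();

    /** Returns the id for the label, assigning the next free id on first use. */
    int intern(String label) {
        Integer id = idsByLabel.get(label);
        if (id == null) {
            id = labelsById.size();
            idsByLabel.put(label, id);
            labelsById.add(label);
        }
        return id;
    }

    /** Reverse lookup, e.g. for printing transitions with their labels. */
    String labelOf(int id) {
        return labelsById.get(id);
    }

    /** Number of distinct symbols seen so far, i.e. the current alphabet size. */
    int size() {
        return labelsById.size();
    }
}
```

Because ids are handed out densely, `size()` also indicates whether the ids would still fit in a byte or short if a narrower storage type were wanted later.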

Nico Huysamen

1 Answer

Really, the alphabet of a finite state machine is a mathematical 'set' of any type. Nothing restricts the content of the set: it could be 1's and 0's, A-Z, or apples and oranges. There is no 'typical' FSM alphabet size per se. Do you have a user in mind for your library?

Eric
  • I realize the theoretical boundaries of the alphabet. I am thinking more in terms of optimization and performance: how large *should* I allow the alphabet to grow? The users will mostly be researchers seeking empirical data. – Nico Huysamen Mar 17 '11 at 08:22
  • @Nico - It still depends on the researchers and the data involved. Why not make a few different implementations based on different approximate set sizes? It can't really be that much more code... – Eric Mar 17 '11 at 08:48
  • Seeing as this thread seems to have died, I will mark this as the correct answer. I have decided to restrict the alphabet to 256 symbols for now, but have designed it so that this can easily be changed later. – Nico Huysamen Mar 24 '11 at 10:57
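
For readers weighing the same trade-off, one way an alphabet fixed at 256 symbols can stay easy to change is to make the size a constructor parameter instead of a hard-coded constant. The sketch below is an assumption about how that might look, not the asker's actual design; `DenseDfa` and its methods are hypothetical names. A dense table like this is only practical for modest state counts, since it needs `stateCount * alphabetSize` ints.

```java
import java.util.Arrays;

// Hypothetical sketch: a DFA whose alphabet size is chosen at construction
// time (e.g. 256 for byte input) rather than baked into the code.
final class DenseDfa {
    static final int NO_TRANSITION = -1;

    private final int alphabetSize;
    private final int[] table; // table[state * alphabetSize + symbol] = target state

    DenseDfa(int stateCount, int alphabetSize) {
        this.alphabetSize = alphabetSize;
        // multiplyExact throws instead of silently overflowing for huge tables.
        this.table = new int[Math.multiplyExact(stateCount, alphabetSize)];
        Arrays.fill(table, NO_TRANSITION);
    }

    void addTransition(int from, int symbol, int to) {
        checkSymbol(symbol);
        table[from * alphabetSize + symbol] = to;
    }

    int step(int state, int symbol) {
        checkSymbol(symbol);
        return table[state * alphabetSize + symbol];
    }

    /** Runs the input from the start state; each byte is read as unsigned (0..255). */
    int run(int start, byte[] input) {
        int state = start;
        for (byte b : input) {
            if (state == NO_TRANSITION) {
                break; // dead end: no transition was defined
            }
            state = step(state, b & 0xFF);
        }
        return state;
    }

    private void checkSymbol(int symbol) {
        if (symbol < 0 || symbol >= alphabetSize) {
            throw new IllegalArgumentException("symbol out of range: " + symbol);
        }
    }
}
```

Usage would be along the lines of `new DenseDfa(2, 256)` followed by `addTransition(0, 'a', 1)`; growing the alphabet later means passing a different size and feeding ints instead of bytes.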