5

I'm trying to create a FSM in C that follows this diagram: FSM

and has output that looks like this: enter image description here

I started by defining an enum that held all the possible states:

typedef enum {
Start = 0,
Build_Id = 1,
Identifier = 2,
Build_Num = 3,
Number = 4,
Error = 5
} State_type;

This is the code I currently have:

State_type analyzeData(State_type* currState, char c);

int main(int argc, char** argv) {
  State_type currState = 0;
  for (int i = 1; i < strlen(*argv); ++i) {
    analyzeData(&currState, *argv[i]);
  }
}



State_type analyzeData(State_type* currState, char c) {
  if (currState == 0) {
    if (isblank(c)) {
      *currState = (State_type) 0;
      return *currState;
    }
    else if (isdigit(c)) {
      *currState = (State_type) 3;
      return *currState;
    }
    else if (isalpha(c)) {
      *currState = (State_type) 1;
      return *currState;
    }
  }
}

My plan was to basically use a series of if-else statements for all the other possible states. I guess I'm a little confused on whether I'm even approaching this correctly. I've been trying to read the answers on other FSM questions but nothing is making sense. Can someone point me in the right direction?

Ismael
  • 197
  • 1
  • 2
  • 12
  • 2
    You've gone to the effort of making an enum but then you don't bother using it? `*currState = (State_type) 3;` etc. That is just silly! – John3136 May 04 '18 at 00:36
  • 1
    have you considered using a switch statement? `swtich (*currState) {`. And then having each case like this inside: `case Start`? – bruceg May 04 '18 at 00:38
  • You will have trouble using `argv` that way; you are taking one character per argument. All of which is irrelevant since the problem statement describes taking input from `stdin`. – Doug Currie May 04 '18 at 01:07
  • @DougCurrie i'm a little confused. am I reading the input wrong? – Ismael May 04 '18 at 02:02
  • 1
    @Ismael, you should be reading the characters using `getchar()` or `getc(stdin)`. You'll need to include `stdio.h`. – Doug Currie May 04 '18 at 02:48

2 Answers2

9

You define an enum that lists your states - good!

typedef enum {
    Start_state = 0,
    Build_Id_state = 1,
    Identifier_state = 2,
    Build_Num_state = 3,
    Number_state = 4,
    Error_state = 5
} State_type;

Slight change to your state transition code,

int
main(int argc, char** argv) {
  State_type currState = 0;
  Action_t action;
  char* p = *argv; char symbol;
  int len = strlen(p);
  //C-strings are zero-indexed
  for (int i=0; i < len; ++i) {
    action = analyzeData(&currState, classify(symbol=*p++));
    switch(action) {
        case None_act: break;
        case Gather_act: //appropriate symbol gathering
        case Emit_act: //handle ident/number print/save
        case Stop_act: //appropriate behavior, e.g. i=len
        ...
    }
  }
}

Build a state transition table holding these entries:

typedef struct state_table_entry_s {
    State_type state;
    Transition_t trans; //could define as bit-field
    State_type nextstate;
    Action_t action; //semantic action
} state_table_entry_t;

Define your state transition table, which makes clear that you have not defined behavior for certain transitions. (Make the table two-dimensional, and you can more quickly process state&transition)

state_table_entry_t states[] = {
    {Start_state, Letter_class, None_act, Build_Id}
   ,{Start_state, Digit_class, None_act, Build_Num}
   ,{Start_state, Blank_class, None_act, Start_state}
   ,{Start_state, Semicolon_class, Stop_act, Start_state}
   ,{Build_Id_state, Letter_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Digit_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Underscore_class, Gather_act, Build_Id_state}
   ,{Build_Id_state, Blank_class, None_act, Identifier_state}
   ,{Identifier_state, Blank_class, Emit_act, Start_state}
   ,{Build_Num_state, Digit_class, Gather_act, Build_Num_state}
   ,{Build_Num_state, Blank_class, None_act, Number_state}
   ,{Number_state, Blank_class, Emit_act, Start_state}
   ,{Stop_state, <any>, Error_act, Stop_state}
   ,{Error_state, <any>, None_act, Stop_state}
};

Notice how the above 'state transition table' clearly documents your state machine? And you could (easily) load this table from a configuration file?

Stop. Did you define (appropriate) actions for every (State X Transition) pair?

//States:
Start_state
Build_Id_state
Identifier_state
Build_Num_state
Number_state
Error_state

//Transitions:
Letter_class
Digit_class
Underscore_class
Blank_class
Semicolon_class
Other_class

For the above, you need to define your state transition classes:

typedef enum {
    Letter_class
   ,Digit_class
   ,Underscore_class
   ,Blank_class
   ,Semicolon_class
   ,Other_class
} Transition_t;

And you need to define your actions:

typedef enum {
    None_act
   ,Gather_act
   ,Emit_act
   ,Stop_act
   ,Error_act
} Action_t;

Convert your characters/symbols into their transition class (you could use ctype.h and the isalpha(), isdigit() macros),

Transition_t classify(char symbol) {
    Transition_t class = Other_class;
    if (isblank(c)) {
      return(class = Blank_class); break;
    }
    else if(isdigit(symbol)) {
      return(class = Digit_class);
    }
    else if (isalpha(symbol)) {
      return(class = Letter_class); break;
    }
    else {
      switch(tolower(symbol)) {
        case ' ':
            return(class = Blank_class); break;
        case '_':
            return(class = Underscore_class); break;
        case ';':
            return(class = Semicolon_class); break;
        default :
            return(class = Other_class); break;
      }
    }
    return(class = Other_class); break;
}

Find your matching state in the state table (you can make this much more efficient), and your matching transition in the transition table, then take the semantic action,

Action_t
analyzeData(State_type& currState, Transition_t class) {
  for( int ndx=0; ndx<sizeof(states)/sizeof(states[0]); ++ndx ) {
    if( (states[ndx].state == currState)
    &&. (states[ndx].trans == class) ) { //state match
        semantic_action(states[ndx].action);
        currState = states[ndx].nextState;
        return(states[ndx].action);
    }
  }
}

You will need to define your 'semantic_action' function, and of course you will need to 'gather' your input, so that you can perform the output at appropriate action times. And your 'emit_act' will need to cleanup.

ChuckCottrill
  • 4,360
  • 2
  • 24
  • 42
2

You'd probably be better off with a switch statement (or even a transition table), but the basic structure is the same.

If you are using an enumeration, use it. Don't use magic numbers. The point of defining an enumeration is to be able to use meaningful names instead of numbers.

If you are going to return the new state, there is absolutely no point using an in-out parameter. Use the prototype

State_type analyzeData(State_type currState, char c);
/* Better would be int c. See below. */

Then a typical state case might be:

case Start:
  if (isblank(c)) return Start;
  else (isdigit(c)) return Build_Num;
  else (isalpha(c)) return Build_Id;
  else return Error;

Also note that isalpha and friends take an int, not a char. If char is signed (which is common) and the value happens to be negative, Undefined Behaviour will result.

rici
  • 234,347
  • 28
  • 237
  • 341
  • thanks for the clarification. I have another question - after I read in a semicolon, I'm supposed to output all the data items and their correct identifiers. I'm not really sure how to accomplish this. should I create an empty char* and add each successive character into it until I reach a blank space, print it out, clear it, and then repeat? – Ismael May 04 '18 at 02:07