You define an enum that lists your states - good!
typedef enum {
Start_state = 0,
Build_Id_state = 1,
Identifier_state = 2,
Build_Num_state = 3,
Number_state = 4,
Error_state = 5
} State_type;
Slight change to your state transition code,
int
main(int argc, char** argv) {
State_type currState = 0;
Action_t action;
char* p = *argv; char symbol;
int len = strlen(p);
//C-strings are zero-indexed
for (int i=0; i < len; ++i) {
action = analyzeData(&currState, classify(symbol=*p++));
switch(action) {
case None_act: break;
case Gather_act: //appropriate symbol gathering
case Emit_act: //handle ident/number print/save
case Stop_act: //appropriate behavior, e.g. i=len
...
}
}
}
Build a state transition table holding these entries:
typedef struct state_table_entry_s {
State_type state;
Transition_t trans; //could define as bit-field
State_type nextstate;
Action_t action; //semantic action
} state_table_entry_t;
Define your state transition table, which makes clear that you have not defined behavior for certain transitions. (Make the table two-dimensional, and you can more quickly process state&transition)
state_table_entry_t states[] = {
{Start_state, Letter_class, None_act, Build_Id}
,{Start_state, Digit_class, None_act, Build_Num}
,{Start_state, Blank_class, None_act, Start_state}
,{Start_state, Semicolon_class, Stop_act, Start_state}
,{Build_Id_state, Letter_class, Gather_act, Build_Id_state}
,{Build_Id_state, Digit_class, Gather_act, Build_Id_state}
,{Build_Id_state, Underscore_class, Gather_act, Build_Id_state}
,{Build_Id_state, Blank_class, None_act, Identifier_state}
,{Identifier_state, Blank_class, Emit_act, Start_state}
,{Build_Num_state, Digit_class, Gather_act, Build_Num_state}
,{Build_Num_state, Blank_class, None_act, Number_state}
,{Number_state, Blank_class, Emit_act, Start_state}
,{Stop_state, <any>, Error_act, Stop_state}
,{Error_state, <any>, None_act, Stop_state}
};
Notice how the above 'state transition table' clearly documents your state machine? And you could (easily) load this table from a configuration file?
Stop. Did you define (appropriate) actions for every (State X Transition) pair?
//States:
Start_state
Build_Id_state
Identifier_state
Build_Num_state
Number_state
Error_state
//Transitions:
Letter_class
Digit_class
Underscore_class
Blank_class
Semicolon_class
Other_class
For the above, you need to define your state transition classes:
typedef enum {
Letter_class
,Digit_class
,Underscore_class
,Blank_class
,Semicolon_class
,Other_class
} Transition_t;
And you need to define your actions:
typedef enum {
None_act
,Gather_act
,Emit_act
,Stop_act
,Error_act
} Action_t;
Convert your characters/symbols into their transition class (you could use ctype.h and the isalpha(), isdigit() macros),
Transition_t classify(char symbol) {
Transition_t class = Other_class;
if (isblank(c)) {
return(class = Blank_class); break;
}
else if(isdigit(symbol)) {
return(class = Digit_class);
}
else if (isalpha(symbol)) {
return(class = Letter_class); break;
}
else {
switch(tolower(symbol)) {
case ' ':
return(class = Blank_class); break;
case '_':
return(class = Underscore_class); break;
case ';':
return(class = Semicolon_class); break;
default :
return(class = Other_class); break;
}
}
return(class = Other_class); break;
}
Find your matching state in the state table (you can make this much more efficient), and your matching transition in the transition table, then take the semantic action,
Action_t
analyzeData(State_type& currState, Transition_t class) {
for( int ndx=0; ndx<sizeof(states)/sizeof(states[0]); ++ndx ) {
if( (states[ndx].state == currState)
&&. (states[ndx].trans == class) ) { //state match
semantic_action(states[ndx].action);
currState = states[ndx].nextState;
return(states[ndx].action);
}
}
}
You will need to define your 'semantic_action' function, and of course you will need to 'gather' your input, so that you can perform the output at appropriate action times. And your 'emit_act' will need to cleanup.