I'm working on formal verification software, where the user defines an algorithm written in C++ to be verified. Without going too much into the details of the subject matter, I will try to express as clearly as possible what I need and my ideas about it.
If the user enters something of the form:
int foo ( [arg1,...,argN] ) {
    if ( T_CONDITION ) {
        T_EXEC;
    }
    else {
        T_EXEC';
    }
}
Then I want to get T_CONDITION and both T_EXEC and T_EXEC', in the form Parts = [ COND => T_CONDITION, EXEC => [ T_EXEC, T_EXEC' ] ], where T_CONDITION is the entire condition, T_EXEC is the statements that the program executes if the condition is true, and T_EXEC' is the statements it executes if it goes into the else branch. I think this is called a "tokenizer" and that it's the job of a parser, but I'm not sure. The problem is that I don't know anything about parsers: I don't know where the condition and the executed blocks begin or end, so I can't handle this with plain string operations.
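To make the "where does it begin and end" problem concrete, here is a rough sketch of delimiter matching with a depth counter. It is not a real C++ parser: it assumes the text starts at a single if/else, and that no parentheses or braces are hidden inside string literals or comments. The names `Parts`, `matchClose`, `trim` and `splitIfElse` are made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type mirroring Parts = [ COND => ..., EXEC => [ ..., ... ] ].
struct Parts {
    std::string cond;
    std::vector<std::string> exec;
};

// Return the index of the delimiter that closes the one at position `open`.
// The depth counter is what pairs each closing delimiter with its opener.
// Assumption: no delimiters inside string literals or comments.
std::size_t matchClose(const std::string& s, std::size_t open) {
    const char o = s[open];
    const char c = (o == '(') ? ')' : '}';
    int depth = 0;
    for (std::size_t i = open; i < s.size(); ++i) {
        if (s[i] == o) ++depth;
        else if (s[i] == c && --depth == 0) return i;
    }
    throw std::runtime_error("unbalanced delimiter");
}

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split "if ( C ) { A } else { B }" into its condition and its blocks.
// Assumption: `src` begins with the `if` keyword of interest.
Parts splitIfElse(const std::string& src) {
    const std::size_t npos = std::string::npos;
    Parts p;

    const std::size_t lp = src.find('(', src.find("if"));
    if (lp == npos) throw std::runtime_error("no condition found");
    const std::size_t rp = matchClose(src, lp);
    p.cond = trim(src.substr(lp + 1, rp - lp - 1));

    const std::size_t lb = src.find('{', rp);
    if (lb == npos) throw std::runtime_error("no block after condition");
    const std::size_t rb = matchClose(src, lb);
    p.exec.push_back(trim(src.substr(lb + 1, rb - lb - 1)));

    // The else branch is optional.
    const std::size_t el = src.find("else", rb);
    if (el != npos) {
        const std::size_t lb2 = src.find('{', el);
        if (lb2 == npos) throw std::runtime_error("no block after else");
        const std::size_t rb2 = matchClose(src, lb2);
        p.exec.push_back(trim(src.substr(lb2 + 1, rb2 - lb2 - 1)));
    }
    return p;
}
```

For example, `splitIfElse("if ( (a > 0) && f(b) ) { x = 1; } else { y = 2; }")` would give `cond == "(a > 0) && f(b)"` and `exec == { "x = 1;", "y = 2;" }`. For arbitrary user code, a real C++ front end is likely a safer choice than string scanning.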
Once I have T_CONDITION, I need to break it down into several atomic logical formulas. Something like:
T_CONDITION = ( ( A OR N ) OR ( B AND C ) OR ( D AND ( E OR F ) ) )
Then I want to get CONDITION_PARTS = [ [ A ], [ N ], [ B , C ], [ D, [ [ E ], [ F ] ] ] ]
That is: if I get A or B, then I need PART = [[A],[B]], and if I get A and B, then PART = [A,B]. But how can I recognize which part of the condition each closing parenthesis belongs to?
Is this possible? What tools should I use to do it? Do you know of any guides about this?
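On the closing-parenthesis question, one common technique is a recursive-descent parser: each `(` starts a recursive call, and the `)` that ends that call is its match. The sketch below assumes the condition has already been reduced to named atoms joined by the words `AND` and `OR` (real C++ conditions with `&&`, `||`, comparisons and so on would need a richer tokenizer), and that `AND` binds tighter than `OR`. `Node`, `Parser` and `show` are illustrative names, not part of any library.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST: an atom, or an AND/OR node over child formulas.
struct Node {
    enum Kind { Atom, And, Or } kind = Atom;
    std::string name;                         // used only for Atom
    std::vector<std::unique_ptr<Node>> kids;  // used for And / Or
};
using NodePtr = std::unique_ptr<Node>;

class Parser {
public:
    explicit Parser(const std::string& text) : s_(text) {}

    NodePtr parse() {
        NodePtr n = parseOr();
        if (!peek().empty()) throw std::runtime_error("unexpected token: " + peek());
        return n;
    }

private:
    std::string s_;
    std::size_t pos_ = 0;

    // Consume and return the next token: "(", ")", a word, or "" at the end.
    std::string next() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
        if (pos_ >= s_.size()) return "";
        if (s_[pos_] == '(' || s_[pos_] == ')') return std::string(1, s_[pos_++]);
        const std::size_t start = pos_;
        while (pos_ < s_.size() &&
               (std::isalnum(static_cast<unsigned char>(s_[pos_])) || s_[pos_] == '_')) ++pos_;
        if (start == pos_) throw std::runtime_error("unexpected character");
        return s_.substr(start, pos_ - start);
    }

    // Look at the next token without consuming it.
    std::string peek() {
        const std::size_t save = pos_;
        std::string t = next();
        pos_ = save;
        return t;
    }

    // Parse `sub (op sub)*`. Children of the same kind are flattened, so
    // (A OR N) OR B becomes one three-way OR.
    NodePtr parseChain(Node::Kind kind, const std::string& op, NodePtr (Parser::*sub)()) {
        std::vector<NodePtr> kids;
        for (;;) {
            NodePtr k = (this->*sub)();
            if (k->kind == kind) {
                for (auto& g : k->kids) kids.push_back(std::move(g));
            } else {
                kids.push_back(std::move(k));
            }
            if (peek() != op) break;
            next();  // consume the operator
        }
        if (kids.size() == 1) return std::move(kids[0]);
        auto n = std::make_unique<Node>();
        n->kind = kind;
        n->kids = std::move(kids);
        return n;
    }

    NodePtr parseOr()  { return parseChain(Node::Or,  "OR",  &Parser::parseAnd); }
    NodePtr parseAnd() { return parseChain(Node::And, "AND", &Parser::parseAtom); }

    NodePtr parseAtom() {
        const std::string t = next();
        if (t == "(") {
            NodePtr inner = parseOr();  // the recursion handles the nesting
            if (next() != ")") throw std::runtime_error("expected ')'");
            return inner;
        }
        if (t.empty() || t == ")" || t == "OR" || t == "AND")
            throw std::runtime_error("expected an atom or '('");
        auto n = std::make_unique<Node>();
        n->name = t;
        return n;
    }
};

// Render in the bracket notation used above: each OR alternative gets its
// own brackets, while the parts of an AND share one pair.
std::string show(const Node& n) {
    if (n.kind == Node::Atom) return n.name;
    std::string out = "[";
    for (std::size_t i = 0; i < n.kids.size(); ++i) {
        const Node& k = *n.kids[i];
        if (i) out += ", ";
        out += (n.kind == Node::Or && k.kind == Node::Atom) ? "[" + k.name + "]" : show(k);
    }
    return out + "]";
}
```

With this sketch, `show(*Parser("( ( A OR N ) OR ( B AND C ) OR ( D AND ( E OR F ) ) )").parse())` yields `[[A], [N], [B, C], [D, [[E], [F]]]]`. Building a tree first and formatting it afterwards keeps the bracket convention separate from the parsing itself.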