Recently I was learning about Parsing Expression Grammars. After reading some materials on it by searching the web I finally sat down to make my hands dirty. I thought of implementing the hoc program from The Unix Programming Environment by Kernighan and Pike, with PEG. For that matter I chose PEG parser generator peg/leg as it has good documentation and an extensive collection of examples. I took dc.peg from the examples as my starting point and just implemented support for unary minus in phoc version 1.
Now when I write the following productions for unary minus the generated parser is not recognizing ((-9)), ((-9)*3)/4, (8-(-2)*5)/2-(2+2), etc.
Operand <- SGNValue
/ Value
SGNValue <- OPEN? '-' Value CLOSE?
Value <- NUMBER
/ OPEN Sum CLOSE
But if I change SGNValue to the following those expressions are perfectly matched.
SGNValue <- '-' NUMBER
/ '-' OPEN Sum CLOSE
Could anyone here please explain to me what was wrong with the above production rule?
I am also attaching phoc1.peg file with this post.
Thanks and regards Santanu
# Grammar
Expr <- SPACE Sum EOL { printf("%f\n", pop()); }
/ (!(EOL / 'q') .)* EOL { printf("error: quiting hoc1\n"); \
exit(1); }
/ 'q' EOL { printf("bye\n"); exit(0);}
Sum <- Product ( PLUS Product { double r = pop(), l = pop(); \
push(l + r); }
/ MINUS Product { double r = pop(), l = pop(); \
push(l - r); }
)*
Product <- Operand ( TIMES Operand { double r = pop(), l = pop(); \
push(l * r); }
/ DIVIDE Operand { double r = pop(), l = pop(); \
push(l / r); }
)*
Operand <- SGNValue
/ Value
## SGNValue <- OPEN? '-' Value CLOSE?
## This production rule was throwing error
## for expressions like ((-9)), ((-9)*3)/4,
## (8-(-2)*5)/2-(2+2), etc. probably due to
## left recurssion. But it was behaving as expected
## for (((9))), ((9)*2), etc.
SGNValue <- '-' NUMBER { push(-1*atof(yytext)); }
/ '-' OPEN Sum CLOSE { double d = pop(); push(-1*d); }
Value <- NUMBER { push(atof(yytext)); }
/ OPEN Sum CLOSE
# Lexemes
NUMBER <- < [0-9]+ ('.' [0-9]+)? > SPACE
PLUS <- '+' SPACE
MINUS <- '-' SPACE
TIMES <- '*' SPACE
DIVIDE <- '/' SPACE
OPEN <- '(' SPACE
CLOSE <- ')' SPACE
SPACE <- [ \t]*
EOL <- '\n' / '\r\n' / '\r'