Given the following language described as:
- formally:
(identifier operator identifier+)*
- in plain English: zero or more operations written as an identifier (the lvalue), then an operator, then one or more identifiers (the rvalue)
An example of a sequence of operations in that language would be, given the arbitrary operator @:
A @ B C X @ Y
Whitespace is not significant and it may also be written more clearly as:
A @ B C
X @ Y
How would you parse this with a yacc-like LALR parser?
What I tried so far
I know how to parse explicitly delimited operations, say A @ B C ; X @ Y, but I would like to know whether parsing the undelimited input above is feasible, and if so, how. Below is a minimal (non-working) example using Flex/Bison.
lex.l:
%{
#include "y.tab.h"
%}
%option noyywrap
%option yylineno
%%
[a-zA-Z][a-zA-Z0-9_]* { return ID; }
@ { return OP; }
[ \t\r\n]+ ; /* ignore whitespace */
. { return ERROR; } /* any other character causes parse error */
%%
yacc.y:
%{
#include <stdio.h>
extern int yylineno;
void yyerror(const char *str);
int yylex();
%}
%define parse.lac full
%define parse.error verbose
%token ID OP ERROR
%left OP
%start opdefs
%%
opright:
| opright ID
;
opdef: ID OP ID opright
;
opdefs:
| opdefs opdef
;
%%
void yyerror(const char *str) {
fprintf(stderr, "error@%d: %s\n", yylineno, str);
}
int main(void) {
return yyparse(); /* 0 on success, nonzero on failure */
}
Build with: $ flex lex.l && yacc -d yacc.y --report=all --verbose && gcc lex.yy.c y.tab.c
The issue: I cannot stop the parser from including the next operation's lvalue identifier in the rvalue of the first operation.
$ ./a.out
A @ B C X @ Y
error@1: syntax error, unexpected OP, expecting $end or ID
The above is always parsed as: reduce(A @ B reduce(C X)) @ Y
I have the feeling that I need to put a condition on the lookahead token: if it is the operator, the last identifier should not be shifted and the current stack should be reduced instead:
A @ B C X @ Y
        ^ *   // ^: current, *: lookahead
-> reduce 'A @ B C' !
-> shift 'X' !
I tried all kinds of operator precedence arrangements but cannot get it to work.
I would also accept a solution that does not use Bison.
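To make the condition on the lookahead concrete, here is a minimal hand-written sketch in plain C, outside Bison. It keeps the current token plus one extra token of lookahead, and treats an identifier that is directly followed by the operator as the start of the next operation rather than as part of the current rvalue. The names `next_tok`, `advance`, `count_ops` and the `T_*` token kinds are hypothetical and exist only for this illustration; the lexical rules are assumed to match lex.l above.

```c
#include <ctype.h>

/* Hypothetical token kinds for this sketch (they mirror ID/OP/ERROR above). */
enum tok { T_END, T_ID, T_OP, T_ERROR };

/* Lexer state: input cursor, current token, and one token of lookahead. */
struct lexer { const char *p; enum tok cur, la; };

/* Scan one token and advance the cursor; whitespace is skipped. */
static enum tok next_tok(const char **s) {
    while (**s && isspace((unsigned char)**s)) (*s)++;
    if (**s == '\0') return T_END;          /* stays at the end on repeat calls */
    if (**s == '@') { (*s)++; return T_OP; }
    if (isalpha((unsigned char)**s)) {
        while (isalnum((unsigned char)**s) || **s == '_') (*s)++;
        return T_ID;
    }
    (*s)++;
    return T_ERROR;
}

/* Shift the window: the lookahead becomes current, and a new token is read. */
static void advance(struct lexer *lx) {
    lx->cur = lx->la;
    lx->la = next_tok(&lx->p);
}

/* Return the number of operations in src, or -1 on a syntax error. */
int count_ops(const char *src) {
    struct lexer lx = { src, T_END, T_END };
    int ops = 0;
    advance(&lx);   /* fill the lookahead slot */
    advance(&lx);   /* fill the current slot   */
    while (lx.cur != T_END) {
        /* An operation must start with: ID OP */
        if (lx.cur != T_ID || lx.la != T_OP) return -1;
        advance(&lx);   /* consume the lvalue   */
        advance(&lx);   /* consume the operator */
        /* Collect rvalue identifiers, but stop at an identifier that is
           followed by OP: it is the lvalue of the next operation. */
        int rvalues = 0;
        while (lx.cur == T_ID && lx.la != T_OP) {
            advance(&lx);
            rvalues++;
        }
        if (rvalues == 0) return -1;   /* the rvalue needs at least one ID */
        ops++;
    }
    return ops;
}
```

With this sketch, `count_ops("A @ B C X @ Y")` returns 2 and `count_ops("A @ B @ C")` returns -1. The point is that the decision to end an rvalue depends on the token after the next identifier, which is one token more than an LALR(1) parser gets to see. One possible way to carry the idea over to Bison, stated here as an assumption rather than a tested solution, would be to do the same peek in the lexer and return a distinct token for an identifier that is followed by the operator.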