I'm trying to get the hang of Jison, but I'm having a bit of trouble. The following parser always returns [], no matter what input you give it.

%lex
%%

"data"\s*             return 'DATA'
[A-Za-z][A-Za-z0-9_]* return 'IDENTIFIER'
[0-9]+("."[0-9]+)?\b  return 'NUMBER'
"="                   return 'ASSIGN'
("<-"|"<+-")          return 'REACT'
"+"                   return '+'
"-"                   return '-'
"*"                   return '*'
"/"                   return '/'
"^"                   return '^'
\n+                   return 'NL'
<<EOF>>               return 'EOF'
.                     return 'INVALID'

/lex

%token NL

/* operator associations and precedence */

%left ASSIGN
%left REACT
%left '+' '-'
%left '*' '/'
%left '^'
%left UMINUS

%start program

%% /* language grammar */

program
    :
        {return [];}
    | program statement
        {return $1.concat([$2]);}
    | program statement EOF
        {return $1.concat([$2]);}
    ;

statement
    : assign NL
        {return $1;}
    ;

assign
    : IDENTIFIER ASSIGN expression
        {return ['assign', $1, $3];}
    | IDENTIFIER REACT expression
        {return ['react', $1, $2, $3];}
    ;

expression
    : NUMBER
        {return +$1;}
    | IDENTIFIER
    ;

The problem is obviously in my definition of the non-terminal program. What would be the proper way to declare it?

seequ
  • I would split `program` into two non-terminals: `program : EOF { return []; } | statements EOF { return $1; }; statements : statement { $$ = [$1]; } | statements statement { $1.push($2); $$ = $1; };` Also, don't use `return` everywhere. Only use `return` in the production rules for `program`. Everywhere else, use `$$ = expr;` instead of `return expr;`. – Aadit M Shah Apr 15 '15 at 16:56
  • @AaditMShah What is the difference? – seequ Apr 15 '15 at 16:57
  • The difference is that your grammar defines `program : { return []; } | ...`, so the first rule (i.e. the empty rule) was always matched, which is why the parser always returned `[]`. In mine, the empty rule is never matched. Instead, `program : EOF | statements EOF` matches either the end of file or a bunch of statements followed by the end of file. The `statements` production rule matches either one statement or a bunch of statements followed by another statement. Hence, it doesn't have the empty production rule either. – Aadit M Shah Apr 15 '15 at 17:11
  • Furthermore, when you `return` from a production rule then parsing stops. Hence, you should only return when you encounter the end of file. Otherwise you should assign to the special variable `$$` instead. – Aadit M Shah Apr 15 '15 at 17:12

2 Answers

As Aadit M. Shah points out in a comment, the problem is that you cannot `return` from a Jison grammar action before the parse is complete. If a parser rule's action executes a `return`, the parser itself returns that value immediately. Instead, assign the semantic value to `$$`.
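
To see why an early `return` cuts the parse short, here is a minimal sketch in plain JavaScript. It is not Jison's actual driver code: `runActions` and `ctx` are hypothetical names, and `ctx.$` merely stands in for `$$`. It assumes only what is described above, namely that the parser stops as soon as an action returns a value.

```javascript
// Hypothetical miniature driver: runs rule actions in order, roughly the way
// a generated parser runs one action per reduction. Not Jison's real code.
function runActions(actions) {
  let value;                                   // stands in for the value stack
  for (const action of actions) {
    const ctx = { $: value };                  // ctx.$ plays the role of $$
    const returned = action(ctx, value);
    if (returned !== undefined) {
      return returned;                         // an action returned: stop here
    }
    value = ctx.$;                             // otherwise keep $$ and go on
  }
  return value;
}

// Style used in the question: the empty `program` rule returns at once,
// so the later action never runs.
const broken = runActions([
  () => [],                                            // program: { return []; }
  (ctx, prog) => prog.concat([["assign", "x", 1]]),    // never reached
]);

// Style used in the fix: inner rules assign to $$, only the last one returns.
const fixed = runActions([
  (ctx) => { ctx.$ = [["assign", "x", 1]]; },          // statements: statement
  (ctx, list) => { list.push(["assign", "y", 2]); ctx.$ = list; },
  (ctx, list) => list,                                 // program: statements EOF
]);

console.log(JSON.stringify(broken)); // []
console.log(JSON.stringify(fixed));  // [["assign","x",1],["assign","y",2]]
```

The first run ends after a single action, which matches the symptom in the question: the result is always `[]`, whatever the input.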

rici

Try:

%start program

%% /* language grammar */

program
    : EOF
        { return []; }
    | statements EOF
        { return $1; }
    ;
statements
    : statement
        { $$ = [$1]; }
    | statements statement
        { $1.push($2); $$ = $1; }
    ;

Also, in every rule other than `program`, replace `return` with `$$ =`:

statement
    : assign NL
        { $$ = $1; }
    ;

assign
    : IDENTIFIER ASSIGN expression
        { $$ = ['assign', $1, $3]; }
    | IDENTIFIER REACT expression
        { $$ = ['react', $1, $2, $3]; }
    ;

expression
    : NUMBER
        { $$ = +$1; /* unary plus converts the matched text to a number, as in the question */ }
    | IDENTIFIER
        { $$ = $1; /* keep the identifier text as the value */ }
    ;
Lucas Farias