
I'm trying to build an AST for a simple programming language (homework). However, I can't get it to work: the intermediate values ($1, $2, ...) seem to be invalid and don't correspond to what I return in the "sub-expressions".

Here is the Bison code for my project (I think the problem is here and not in my AST functions). I've added comments where I see invalid values. It's my first project using Bison, so I'm not sure I'm doing things correctly.

I also use Flex but the flex code seems to work correctly.

Thanks.

%{
#include <stdio.h>

#include "node.h"
#include "print_node.h"

int yylex();
int yyerror(char * s);

CommandNode * root = NULL;
%}

%union
{
    struct ExpressionNode * expression;
    struct CommandNode    * command;
    int    number;
    char * var;
}

%type   <expression>    E T F
%type   <command>       C

%token  <number>        NUMBER
%token  <var>           VAR

%token                  AF SKIP SEQ IF THEN ELSE WHILE DO ADD SUB MUL EOL

%%

root:           C EOL      { root = $1; return 0; /************ $1 seems to be garbage ************/ }
                ;

E:              E ADD T    { $$ = newAddNode($1,$3); }
        |       E SUB T    { $$ = newSubNode($1,$3); }
        |       T          { $$ = $1;                }
        ;

T:              T MUL F    { $$ = newMulNode($1,$3); }
        |       F          { $$ = $1;                }
        ;

F:              '(' E ')'  { $$ = $2;                }
        |       NUMBER     { $$ = newNumberNode($1); }
        |       VAR        { $$ = newVarNode($1);    }
        ;

C:              SKIP                 { $$ = newSkipNode();       }
        |       VAR AF E             { $$ = newAfNode($1,$3);    }
        |       '(' C ')'            { $$ = $2;                  }
        |       IF E THEN C ELSE C   { $$ = newIfNode($2,$4,$6); }
        |       WHILE E DO C         { $$ = newWhileNode($2,$4); }
        |       C SEQ C              { $$ = newSeqNode($1,$3); /************ $1 and $3 seem to be garbage ************/ }
        ;

%%

int main()
{
    yyparse();
}

int yyerror(char * s)
{
    fprintf(stderr, "yyerror: %s\n", s);
    return 0; /* declared to return int, so return a value */
}
  • "(I think the problem is here and not in my AST functions)" I'm not so sure about that. Can you post them, too? – sepp2k Mar 10 '17 at 19:25
  • What you pasted looks fine to me. Do your node creation functions malloc memory or do they just return the address of a local variable? If the second, that will be the problem. If it's the first, you will eventually need to insert code to free the allocated memory. As always, you can help get a good answer by providing a [mcve]. – rici Mar 10 '17 at 19:30
  • I removed the flex tag since you say it is not relevant to your problem. If you have a good reason to put it back or have another question in the future, the correct tag for the flex lexical scanner generator is [tag:flex-lexer]; [tag:flex] is an embedded language now part of the Apache project. – rici Mar 10 '17 at 19:35
  • I will re-add the flex-lexer tag since my bug was related to it. Thank you all for your quick response. – Timothée Jourde Mar 18 '17 at 11:51

1 Answer


Most commonly, the symptoms you describe happen because your lexer (the flex code, which you don't show) returns yytext directly. Since yytext points into the scanner's internal buffer, the value looks fine at that instant, but after the next token(s) are read, it mysteriously changes. This will happen if you have a flex rule like:

[a-zA-Z][a-zA-Z0-9]*    { yylval.var = yytext; return VAR; }

To fix it, you need to make a copy of yytext before returning it to your parser. Something like

[a-zA-Z][a-zA-Z0-9]*    { yylval.var = strdup(yytext); return VAR; }

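will do the trick, though it exposes you to memory leaks: strdup allocates a new string for every token, and each copy has to be freed once it is no longer needed.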
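To see why the pointer goes stale, here is a minimal sketch in plain C that imitates the situation without flex. The names scan_buffer, fake_scan and show_aliasing are hypothetical stand-ins invented for this illustration (scan_buffer plays the role of the buffer yytext points into); they are not part of flex or Bison, and a real scanner manages its buffer differently.

```c
#define _POSIX_C_SOURCE 200809L  /* strdup is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's internal buffer. */
static char scan_buffer[32];

/* Hypothetical stand-in for scanning one token: overwrites the shared
   buffer and returns a pointer into it, the way yytext behaves. */
static char *fake_scan(const char *token)
{
    strncpy(scan_buffer, token, sizeof scan_buffer - 1);
    scan_buffer[sizeof scan_buffer - 1] = '\0';
    return scan_buffer;
}

/* Stores the first token both ways, then scans a second token. */
static void show_aliasing(void)
{
    char *aliased = fake_scan("foo");   /* like yylval.var = yytext */
    char *copied = strdup(aliased);     /* like yylval.var = strdup(yytext) */
    assert(copied != NULL);

    fake_scan("bar");                   /* the parser reads the next token */

    printf("aliased: %s\n", aliased);   /* now shows the second token */
    printf("copied:  %s\n", copied);    /* still shows the first token */
    assert(strcmp(aliased, "bar") == 0);
    assert(strcmp(copied, "foo") == 0);

    free(copied);                       /* the copy must be released */
}
```

Calling show_aliasing() from a main function prints "bar" for the aliased pointer and "foo" for the copy. In a real parser the free would happen wherever the string is finally consumed or discarded, for example when the AST is destroyed.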

Chris Dodd
  • Thank you! I was copying the string in my AST functions, but obviously that was too late. My bug is now resolved (however, I'm not sure it was the only issue, as I corrected a few other mistakes too). – Timothée Jourde Mar 18 '17 at 11:47