
I have a lexical analyser written in flex that passes tokens to my parser written in bison.

The following is a small part of my lexer:

ID [a-z][a-z0-9]*

%%

rule {
    printf("A rule: %s\n", yytext);
    return RULE;
}

{ID} { 
    printf( "An identifier: %s\n", yytext );
    return ID;
}

"(" return LEFT;
")" return RIGHT;

There are other bits for parsing whitespace etc too.

Then part of the parser looks like this:

%{
#include <stdio.h>
#include <stdlib.h>
#define YYSTYPE char*
%}

%token ID RULE 
%token LEFT RIGHT 

%%

rule_decl : 
    RULE LEFT ID RIGHT { printf("Parsing a rule, its identifier is: %s\n", $2); }
    ;

%%

It's all working fine but I just want to print out the ID token using printf - that's all :). I'm not writing a compiler.. it's just that flex/bison are good tools for my software. How are you meant to print tokens? I just get (null) when I print.

Thank you.

ale
  • `$2` would be the `LEFT` token in that rule, wouldn't it? Wouldn't `$3` be the `ID` token you want to print out? – Chris Lutz Jul 05 '11 at 20:44
  • I thought $$ would be RULE, $1 is LEFT, $2 is ID and $3 is RIGHT? No? I'm probably wrong :s. Anyhow.. printing out any of them ($$, $1, ...) all result in (null) so I'm doing something else wrong hmm. – ale Jul 05 '11 at 20:47
  • @alemaster: @Chris Lutz is right; the tokens in the rule are numbered starting at 1. `$$` is a variable to which you can assign a pointer; that pointer will then be taken as the "result" of the rule and will be passed to other rules involving `rule_decl`. – Aasmund Eldhuset Jul 05 '11 at 20:52
  • Thanks guys. +1 to Chris. Any ideas how to just print these things? – ale Jul 05 '11 at 20:54

1 Answer


I'm not an expert at yacc, but the way I've been handling the transition from the lexer to the parser is as follows: for each lexer token, you should have a separate rule to "translate" the yytext into a suitable form for your parser. In your case, you are probably just interested in yytext itself (while if you were writing a compiler, you'd wrap it in a SyntaxNode object or something like that). Try

%token ID RULE 
%token LEFT RIGHT

%%

rule_decl:
    RULE LEFT id RIGHT { printf("%s\n", $3); }
    ;

id:
    ID { $$ = strdup(yytext); }  /* copy: yytext is overwritten by the next token */
    ;

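The point is that the last rule copies yytext into $$, which makes it available as a $ variable in any rule that uses id (here $3, since the symbols in a rule are numbered from 1). The lexer you showed never gives its tokens a value, which is why printing them shows (null).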
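Another approach I have seen, which avoids touching yytext from the parser (that can be fragile if the parser has already read a lookahead token), is to have the lexer store the text in yylval before it returns ID, so that the ID token itself carries the string. Below is a rough plain-C sketch of that hand-off, without flex or bison, just to show the mechanism. The token numbers, `cursor` and `parse_rule_decl` are hypothetical stand-ins I made up for illustration, not what the tools generate.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for what flex and bison would provide. */
typedef char *YYSTYPE;                  /* mirrors "#define YYSTYPE char*" */
enum { END = 0, ID = 258, RULE, LEFT, RIGHT };

static YYSTYPE yylval;                  /* value attached to the latest token */
static const char *cursor;              /* hypothetical input position */

static int is_id_start(int c) { return c >= 'a' && c <= 'z'; }
static int is_id_rest(int c)  { return is_id_start(c) || (c >= '0' && c <= '9'); }

/* Hand-written stand-in for the scanner: [a-z][a-z0-9]*, "rule", "(" and ")". */
static int yylex(void)
{
    yylval = NULL;
    while (isspace((unsigned char)*cursor))
        cursor++;
    if (*cursor == '\0') return END;
    if (*cursor == '(') { cursor++; return LEFT; }
    if (*cursor == ')') { cursor++; return RIGHT; }
    if (is_id_start((unsigned char)*cursor)) {
        const char *start = cursor;
        while (is_id_rest((unsigned char)*cursor))
            cursor++;
        size_t len = (size_t)(cursor - start);
        if (len == 4 && strncmp(start, "rule", 4) == 0)
            return RULE;
        /* In a flex action this step would be: yylval = strdup(yytext); */
        yylval = malloc(len + 1);
        assert(yylval != NULL);
        memcpy(yylval, start, len);
        yylval[len] = '\0';
        return ID;
    }
    return -1;                          /* unknown character */
}

/* Plays the role of: rule_decl : RULE LEFT ID RIGHT { ... $3 ... }
 * Returns the identifier (caller frees it), or NULL on a syntax error. */
static char *parse_rule_decl(const char *input)
{
    static const int expected[] = { RULE, LEFT, ID, RIGHT };
    YYSTYPE vals[5] = { NULL };         /* vals[1]..vals[4] mimic $1..$4 */

    cursor = input;
    for (int i = 1; i <= 4; i++) {
        int tok = yylex();
        vals[i] = yylval;               /* the parser saves each token's value */
        if (tok != expected[i - 1]) {
            for (int j = 1; j <= i; j++)
                free(vals[j]);
            return NULL;
        }
    }
    return vals[3];                     /* the ID is the third symbol, i.e. $3 */
}
```

Calling `parse_rule_decl("rule ( foo )")` hands back a copy of `foo`, which you can print with printf and then free. In the real files the equivalent would presumably be `yylval = strdup(yytext);` in the `{ID}` action of the lexer (with the same YYSTYPE definition visible there) and `$3` in the original rule_decl action, but I have not tested that against your setup.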

Aasmund Eldhuset
  • Thank you Aasmund.. just trying to figure out why "error: ‘yytext’ undeclared (first use in this function)" is happening but I'm probably missing something. – ale Jul 05 '11 at 21:08
  • @alemaster: Use `extern char yytext[];` in the topmost section, or possibly `extern char * yytext;`. – Aasmund Eldhuset Jul 05 '11 at 21:19
  • Fantastic thank you. FYI: Your second option worked but the first just made nothing print (rather than (null)). Thank you :). – ale Jul 05 '11 at 21:27
  • @alemaster: Good. I seem to recall having been bitten by different YACC implementations having different conventions for whether to use `char[]` or `char*`; that's why I provided both. There might also be an option for controlling the type, but I don't remember what it is. – Aasmund Eldhuset Jul 05 '11 at 21:57