I have a short question about my bison grammar. The files compile, but the parser does not give the result I expected. ;-)

My example file that I want to parse:

L1: ldh [23]

The lexer file looks like this:

...

digit           [0-9]
digit_s         [1-9]
digit_n         [0]
hex             [a-fA-F0-9]
hex_x           [x]
number_dec      {digit_n}|{digit_s}{digit}*
number_hex      {digit_n}{hex_x}{hex}+
label_s         [a-zA-Z]
label_me        [a-zA-Z0-9_]+
label           {label_s}{label_me}+

%%

"ldb"           { return OP_LDB; }
"ldh"           { return OP_LDH; }
...

{number_hex}    { yylval.number = strtoul(yytext, NULL, 16);
                  return number_hex; }

{number_dec}    { yylval.number = strtoul(yytext, NULL, 10);
                  return number_dec; }

{label}         { yylval.label = xstrdup(yytext);
                  return label; }

The bison file looks like this:

...
%}

%union {
    int number;
    char *label;
}

%token OP_LDB OP_LDH ...
%token number_hex number_dec label
%type <number> number_hex number_dec number do_ldb
%type <label> label do_label

%%

prog
    : {}
    | prog line { }
    ;

line
    : instr { }
    | labeled_instr { }
    ;

labeled_instr
    : do_label instr { }
    ;

instr
    : do_ldb { }
    | do_ldh { }
    ...
    ;

number
    : number_dec { $$ = $1; }
    | number_hex { $$ = $1; }
    ;

do_label
    : label ':' { info("got:%s\n", $1); }
    ;

do_ldb
    : OP_LDB '[' 'x' '+' number ']' { info("got:%d\n", $5); }
    | OP_LDB '[' number ']' { info("got:%d\n", $3); }
    ;

Now my program tells me the following:

Syntax error at line 1: ldh! syntax error, unexpected OP_LDH, expecting ':'!

Do you have any idea what I did wrong?

Big thanks!

johnloom
  • How did you define do_label and do_ldh? – Omri Barel Aug 13 '11 at 13:29
  • Does your lexer generate the ':' token? – Omri Barel Aug 13 '11 at 13:49
  • You probably don't want both the plus '+' after the definition of `label_me` and after the use of `{label_me}`. At least, it unnecessarily complicates things, even if the result is the same. – Jonathan Leffler Aug 13 '11 at 19:07
  • Also, by using `+` instead of `*` here, you can't have single-character labels, which might not be what you intend. – Chris Dodd Aug 13 '11 at 19:14
  • Thanks to all, the missing rule solved it. I had assumed that writing a token directly in the grammar, like '+', meant I would not need a lexer rule for it, but obviously I was wrong. Also thanks for spotting the '+' in the label; I had overlooked it. :-) –  Aug 14 '11 at 10:23

1 Answer

You're probably missing the rule

":"    { return ':'; }

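or something equivalent in your lexer. A character literal such as ':' in the bison grammar only names a token; the lexer still has to return it.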

If you are using flex, you probably want to give it the --nodefault option so that unmatched input is reported as an error instead of being silently echoed and skipped. Alternatively, you can put %option nodefault in the first section.
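
As a rough sketch of how this could look, assuming flex and assuming the elided parts of the question's lexer do not already handle these characters: the fragment below returns each literal token used in the grammar as its own character code and turns on nodefault. The whitespace, newline and catch-all rules are illustrative assumptions, not taken from the question.

```lex
%option nodefault
/* With nodefault, input that matches no rule makes the scanner abort
   instead of being echoed to stdout, so a missing rule shows up at once. */

%%

":"             |
"["             |
"]"             |
"+"             |
"x"             { /* Literal tokens from the grammar: return the character
                     itself, which is the token code bison uses for ':' etc. */
                  return yytext[0]; }

[ \t]+          { /* assumed: skip blanks between tokens */ }
\n              { /* assumed: end of line; count lines here if needed */ }
.               { /* assumed catch-all: report anything still unmatched */
                  fprintf(stderr, "unexpected character '%s'\n", yytext); }
```

These rules would go before `{label}` in the rules section, so that a lone "x" is still returned as 'x' if the label pattern is later changed to accept single-character names. Because the catch-all covers every remaining character, nodefault here mainly acts as a check when the scanner is generated.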

Chris Dodd