
So we have a tutorial on flex and bison before we start our compilation techniques course at university.

The following text should be split into lines and newlines:

testtest test data
second line in the data
another line without a trailing newline

This is what my parser should output:

Line: testtest test data
NL
Line: second line in the data
NL 
Line: another line without a trailing newline

When I run the following:

cat test.txt | ./parser 

This returns:

LINE: testtest test data
It's a bad: syntax error

This is in my .y file:

%{
  #include<stdio.h>
  int yylex();            /* Suppress C99 warning on OSX */
  extern char *yytext;    /* Correct for Flex */
  unsigned int total;

%}
%token LINE
%token NL
%%
line    : LINE              {printf("LINE: %s\n", yytext);}
        ;
newline : NL                {printf("NL\n");}
        ;

And this is in my binary.flex file:

%top{
#define YYSTYPE int
#include "binary.tab.h"         /* Token values generated by bison */
}
%option noyywrap
%%
[^\n\r/]+   return LINE; 
\n          return NL;      
%%

So, any ideas on how to solve this problem?

PS: This is my .c file

#include<stdio.h>
#include "binary.tab.h"
extern unsigned int total;

int yyerror(char *c)
{
  printf("It's a bad: %s\n", c);
  return 0;
}

int main(int argc, char **argv)
{
  if(!yyparse())
    printf("It's a mario time: %d\n",total);
  return 0;
}
rici

1 Answer


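Your bison grammar recognizes precisely one LINE (without a newline), because a bison parser recognizes only the start symbol, and without a %start declaration the start symbol is the first non-terminal defined in the grammar (here, line). Just that, and no more. The newline rule is never reached from line, so the NL token that follows is a syntax error.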

If you want to recognize multiple lines, each consisting of a LINE and possibly a NL, you'll need to add a definition for an input consisting of multiple lines, each consisting of ... . I'm not sure why you would use bison for this, though, since the original problem seems easy to solve with just flex.
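As a rough illustration of what "an input consisting of multiple lines" can look like, here is a minimal sketch of a replacement rules section for the .y file in the question. The non-terminal names input and item are made up for this example, and the "Line:" spelling follows the desired output rather than the "LINE:" in the original action. It keeps the question's approach of reading yytext inside the action, which happens to work here because bison does not need a lookahead token before reducing item, but that is fragile in general; passing the text through yylval would be more robust.

```yacc
%{
#include <stdio.h>
int yylex(void);
int yyerror(char *msg);   /* defined in the question's .c file */
extern char *yytext;      /* text of the most recent token, owned by flex */
unsigned int total;       /* kept only because the .c file refers to it */
%}
%token LINE
%token NL
%%
/* "input" is the first non-terminal, so it becomes the start symbol.
   The left-recursive rule lets it match any number of items, including none. */
input   : /* empty */
        | input item
        ;
/* Each item is either a line of text or a newline, in any order. */
item    : LINE              { printf("Line: %s\n", yytext); }
        | NL                { printf("NL\n"); }
        ;
%%
```

Because input may be empty and item may be a bare NL, this version also accepts blank lines and a final line without a trailing newline.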

By the way, if your input file includes a \r character, none of your flex patterns will recognize it (the flex-generated default rule will catch it, but that is almost never what you want). Use %option nodefault so that you get a warning about this sort of error. And react when you see warnings: you will have seen several when you ran bison on your bison file, I'm sure.
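For the \r point, one possible shape for the flex file is sketched below. It is only an assumption about what the scanner should do: it treats \r\n as a single newline, silently drops a stray \r, and removes the / from the character class (in the original pattern a / in the input would also fall through to the default rule). With every character covered by some rule, %option nodefault should no longer have anything to warn about.

```lex
%top{
#define YYSTYPE int
#include "binary.tab.h"         /* Token values generated by bison */
}
%option noyywrap nodefault
%%
[^\n\r]+    return LINE;        /* any run of characters other than CR or LF */
\r?\n       return NL;          /* LF, or CR LF as one newline */
\r          ;                   /* assumption: ignore a lone carriage return */
%%
```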

rici
  • Yeah, I know. The first task was to do it using just flex, but the second was omitting all output/input in flex and putting it in bison, so that's why I'm here. As I just ran the provided makefile, I did not get any warnings. But I ran bison now, and it gives me errors. I will check back after I have tried to deal with them. How do I extend my flex patterns so they catch the line break as well? Just make another rule? – adam björkman Jan 16 '17 at 06:42
  • @adam: your flex patterns are fine, except for the \r thing, which is not critical and easy to fix. Your problem with bison is that you need to define your entire input; bison is intended to parse programs. If you just want to split the input into pieces, you don't need it; flex already does that. – rici Jan 16 '17 at 07:05
  • Hmm, okay. Maybe I misunderstood the assignment: Remove any printing/output from the lexer and implement a simple grammar that recognises sequences of lines and newlines, outputting the text of each line from actions in the grammar. The output should look the same as exercise 1, but should come from actions in the bison grammar rather than the flex grammar. I'll try and will be back! – adam björkman Jan 16 '17 at 08:27
  • @adambjörkman: It wouldn't have been my first choice as an assignment, but I guess I see where your professor is coming from. The key is the phrase "recognises sequences". Where in your attempt to define a grammar is a "sequence"? What do you need to write to describe (and therefore recognize) a sequence? – rici Jan 16 '17 at 13:55
  • Finally did it. I totally get what you were saying now, with me only recognising one line. I did a `start : textline; | start textline; ;` which works wonders. Now I only need to: Implement a flex definition with tokens for: pipes, semicolon, newlines and text. Output the tokens and their values in the bison grammar. Thanks! :) – adam björkman Jan 16 '17 at 17:00