
I am new to flex and bison. I am trying to write a simple grammar that accepts a word in lowercase followed by a word in uppercase. Below are my files.

file.l

%{
#include<stdio.h>
#include<string.h>
#include "y.tab.h"

int yywrap(void)
{
    printf("parsing is done*\n");   
    //yylex();
    //return 0;
}
%}

%%
[a-z]* { printf("found lower\n");
    yylval=yytext;
    return LOWER;
}
[A-Z]* { printf("found upper\n");
    yylval=yytext;
    return UPPER;
}

[ \n] ;
. ;
%%
void main()
{


        yyin = fopen("file.txt", "r");
        yylex();//this function will start the rules section.... it starts the parsing.....
        fclose(yyin);

}//main ends

file.y

%{
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#define YYSTYPE char *
int yylex(void);
void yyerror(const char *str)
{
        fprintf(stderr,"error: %s\n",str);
}
%}

%token LOWER UPPER

%%
start   :
    |
    start LOWER UPPER
    {
        printf("%s--%s\n",$2,$3);

    }
%%

contents of file.txt is:

token TOKEN

this is how i compile and run:

flex file.l

yacc -d file.y

gcc lex.yy.c y.tab.c -o file

./file

The compiler gives this warning: warning: assignment makes integer from pointer without a cast [-Wint-conversion] yylval=yytext;

When I run the program (ignoring the warning), the output is "found lower", i.e. the program stops reading tokens after return LOWER. Can anyone explain why it behaves this way? Also, why is the warning generated even though I specified #define YYSTYPE char * in file.y?

melpomene
RaKo

1 Answer


1. Why is the warning generated even though I specified #define YYSTYPE char * in file.y?

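Because that define is not visible in file.l: the prologue of file.y is not copied into y.tab.h, so the scanner falls back to the default YYSTYPE, which is int. Both files must have consistent definitions of YYSTYPE.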

Also, you should be aware that it is never correct to simply set yylval = yytext because the buffer into which yytext points is part of a private data structure of the lexical scanner. If you need to pass the token's string value to the parser, you must make a copy.

2. Why does main not read the whole file?

Because you are never actually calling the parser, whose name is yyparse. If you are using a standard bison parser, you should never call yylex directly; yylex is called by the parser when it needs a token. [Note 1]

Since yylex just returns a single token, calling it once will produce one token. You can call it in a loop, as suggested in a comment, but that will still not parse the file.


Notes

  1. Bison can generate "push-parsers" which are called by the lexer when it has an available token. In that case, the lexer actions would not return until the entire input has been parsed, and you would call yylex rather than yyparse. That can simplify the parsing of certain languages, but it is certainly not the case here.
rici
  • I got the solution. I changed the code in the main function of file.l to `yyin = fopen("file.txt", "r"); while(yylex()); fclose(yyin);` Now the whole file is read, but the output is "found lower found upper parsing is done******". This time no output from file.y is produced. Does this mean that only tokenization is done and no parsing? How do I do the parsing? – RaKo Feb 05 '17 at 21:07
  • @RaKo: OK, now I've answered both your questions. Please read both suggestions and make sure you fix the issues with `yylval`; otherwise, you will end up with very mysterious-looking tokens in your parser. – rici Feb 06 '17 at 04:05
  • The explanation for part 2 is great. I have got the solution: 1) put `#define YYSTYPE char *` in the definition section of both file.l and file.y; 2) change the main function in file.l to `yyin = fopen("file.txt", "r"); yyparse(); fclose(yyin);` Thanks for your help, I could not find this simple thing anywhere else. – RaKo Feb 06 '17 at 07:18