Page 78 of the Flex user's manual says:
There is no way to write a rule which is "match this text, but only if it comes at the end of the file". You can fake it, though, if you happen to have a character lying around that you don't allow in your input. Then you can redefine YY_INPUT to call your own routine which, if it sees an EOF, returns the magic character first (and remembers to return a real EOF next time it's called).
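For reference, the routine the manual describes can be sketched in plain C, outside of flex. This is only an illustration under assumptions: the names eof_reader, next_char and MAGIC_CHAR are made up here (they are not part of flex), and '@' is assumed never to occur in real input.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: this character never appears in real input. */
#define MAGIC_CHAR '@'

/* Hypothetical helper state: remembers whether the magic
   character has already been handed out. */
struct eof_reader {
    FILE *fp;
    int sent_magic;
};

/* Returns the next character from r->fp. At end of file it returns
   MAGIC_CHAR exactly once, then EOF on every later call. */
int next_char(struct eof_reader *r)
{
    assert(r != NULL && r->fp != NULL);

    if (r->sent_magic)
        return EOF;              /* the real EOF, second time around */

    int c = fgetc(r->fp);
    if (c == EOF) {
        r->sent_magic = 1;       /* remember for the next call */
        return MAGIC_CHAR;       /* fake "end of file" marker */
    }
    return c;
}
```

A redefined YY_INPUT would call something like next_char() and store the result in buf[0], reporting YY_NULL once it returns EOF.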
I am trying to implement that approach. In fact I managed to get it working (see below). For this input:
Hello, world
How are you?@
I get this (correct) output:
Here's some text Hello, world
Saw this string at EOF How are you?
But I had to do two things in my implementation to get it to work; two things that I shouldn't have to do:
I had to call yyterminate(). If I don't call yyterminate(), the output is this:
Here's some text Hello, world
Saw this string at EOF How are you?
Saw this string at EOF
I shouldn't be getting that last line. Why am I getting that last line?
I don't understand why I had to do this: tmp[yyleng-1] = '\0'; (subtract 1). I should be able to do this: tmp[yyleng] = '\0'; (not subtract 1). Why do I need to subtract 1?
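%option noyywrap
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int sawEOF = 0;   /* set once the magic character has been returned */

/* Feed the scanner one character at a time. At end of file, return
   the magic character '@' first, then a real EOF on the next call. */
#define YY_INPUT(buf,result,max_size) \
{ \
    if (sawEOF == 1) \
        result = YY_NULL; \
    else { \
        int c = fgetc(yyin); \
        if (c == EOF) { \
            sawEOF = 1; \
            buf[0] = '@'; \
            result = 1; \
        } \
        else { \
            buf[0] = c; \
            result = 1; \
        } \
    } \
}
%}
EOF_CHAR @
%%
[^\n@]*{EOF_CHAR}  { char *tmp = strdup(yytext);
                     tmp[yyleng-1] = '\0';
                     printf("Saw this string at EOF %s\n", tmp);
                     free(tmp);
                     yyterminate();
                   }
[^\n@]+            { printf("Here's some text %s\n", yytext); }
\n                 { }
%%
int main(int argc, char *argv[])
{
    /* Require an input file and make sure it can be opened. */
    if (argc < 2 || (yyin = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "usage: %s input-file\n", argv[0]);
        return 1;
    }
    yylex();
    fclose(yyin);
    return 0;
}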