flex uses (approximately) the POSIX "Extended Regular Expression" syntax, so \s doesn't work: it's a Perl extension.
Is [ \t\t\r]+ a typo? It lists \t twice, and I think you'll want a \n in there.
Something like [ \n\t\r]+ certainly should work. For example, this lexer (which I've saved as lexer.l):
%{
#include <stdio.h>
%}
%option noyywrap
%%
[ \n\t\r]+ { printf("Whitespace: '%s'\n", yytext); }
[^ \n\t\r]+ { printf("Non-whitespace: '%s'\n", yytext); }
%%
int main(void)
{
    yylex();
    return 0;
}
...successfully matches the whitespace in your example input (which I've saved as input.txt):
$ flex lexer.l
$ gcc -o test lex.yy.c
$ ./test < input.txt
Non-whitespace: 'program'
Whitespace: '
'
Non-whitespace: '3.3'
Whitespace: ' '
Non-whitespace: '5'
Whitespace: ' '
Non-whitespace: '7'
Whitespace: '
'
Non-whitespace: '{'
Whitespace: ' '
Non-whitespace: 'comment'
Whitespace: ' '
Non-whitespace: '}'
Whitespace: '
'
Non-whitespace: 'string'
Whitespace: '
'
Non-whitespace: 'panic:'
Whitespace: ' '
Non-whitespace: 'cant'
Whitespace: ' '
Non-whitespace: 'happen'
Whitespace: '
'