In flex, I want to return multiple tokens for one match of a regular expression. Is there a way to do this?
3 Answers
3
The way I've been doing this is to create a queue of to-be-returned tokens and, at the beginning of yylex(), check for queued tokens and return them first.
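A minimal sketch of that queue idea in plain C, so it can be compiled without flex. Everything here is hypothetical and chosen only for illustration: the token codes, the fixed-size ring buffer, and `raw_scan()`, which stands in for the flex-generated scanner and treats the text "()" as a single match that should yield two tokens.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; a real project would take these from its parser. */
enum { TOK_EOF = 0, TOK_L_PAREN = 258, TOK_R_PAREN, TOK_INT };

/* Small ring buffer holding tokens that still have to be returned. */
#define QUEUE_CAP 8
static int queue[QUEUE_CAP];
static size_t q_head, q_len;

static void push_token(int tok) {
    assert(q_len < QUEUE_CAP);              /* sketch only: no growth strategy */
    queue[(q_head + q_len) % QUEUE_CAP] = tok;
    q_len++;
}

static int pop_token(void) {
    int tok = queue[q_head];
    q_head = (q_head + 1) % QUEUE_CAP;
    q_len--;
    return tok;
}

/* Stand-in for the generated scanner, reading from a fixed string. */
static const char *input = "() 42";

static int raw_scan(void) {
    for (;;) {
        if (*input == '\0')
            return TOK_EOF;
        if (strncmp(input, "()", 2) == 0) {
            /* One match, two tokens: return the first, queue the second. */
            input += 2;
            push_token(TOK_R_PAREN);
            return TOK_L_PAREN;
        }
        if (isdigit((unsigned char)*input)) {
            while (isdigit((unsigned char)*input))
                input++;
            return TOK_INT;
        }
        input++;                            /* skip anything else */
    }
}

/* What the parser calls: drain the queue before scanning more input. */
static int next_token(void) {
    if (q_len > 0)
        return pop_token();
    return raw_scan();
}
```

In an actual flex file, the `if (q_len > 0) return pop_token();` check could be placed as indented code before the first rule in the rules section, since flex runs such code each time yylex() is entered; the actions would then call push_token() for any extra tokens before returning the first one.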

Zifre
0
Do you mean all matches? Are you using regex functions or string functions? Use the global flag.
As for flex, I don't think you can do that. You test for a match with one pattern at a time so that's probably out of scope. Why'd you want that? As an optimization? Scoping issues?

Lesmana

dirkgently
-
To be honest, I am fairly new to flex and I am not sure. I thought I was using a combination of regular expressions and string matching. Here is an example: "(" { return L_PAREN; } {INT} { yylval.Int = atoi(yytext); return INT; } What I want is to be able to return two tokens at once. – Eburetto Feb 22 '09 at 09:44
-1
Usually, this is handled by a parser on top of the scanner, which gives you much cleaner code. You can emulate that to some degree with states (start conditions):
%option noyywrap
%top {
    /* printf() is used in main() below */
    #include <stdio.h>

    #define TOKEN_LEFT_PAREN 4711
    #define TOKEN_RIGHT_PAREN 4712
    #define TOKEN_NUMBER 4713
}
%x PAREN_STATE
%%
"(" { BEGIN(PAREN_STATE); return TOKEN_LEFT_PAREN; }
<PAREN_STATE>{
[0-9]+ { return TOKEN_NUMBER; }
")" { BEGIN(INITIAL); return TOKEN_RIGHT_PAREN; }
.|\n { /* maybe signal syntax error here */ }
}
%%
int main (int argc, char *argv [])
{
    int i;

    /* yylex () returns 0 at end of input */
    while ((i = yylex ()))
        printf ("%d\n", i);
    return 0;
}
but this will get very messy as soon as your grammar gets more complex.

Tim Landscheidt
-
Fascinating. I'm sure at one point this answer had a score of 10 or more for many years, but in the past, hmmm, six months it has gone negative without any indication why. – Tim Landscheidt Mar 11 '18 at 18:54