7

In flex, I want to return multiple tokens for one match of a regular expression. Is there a way to do this?

Cœur
Eburetto

3 Answers

3

The way I've been doing this is to keep a queue of to-be-returned tokens and, at the beginning of yylex(), check whether the queue is non-empty; if it is, return the next queued token before scanning any more input.
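A minimal sketch of that queue idea in plain C, not taken from any real scanner: the token codes, `push_token()`, `pop_token()`, `next_token()`, and the stub `raw_scan()` (standing in for the flex-generated scanner) are all hypothetical names chosen for illustration. It assumes a small fixed-size queue is enough.

```c
#include <assert.h>

/* Hypothetical token codes, for illustration only. */
enum { TOK_L_PAREN = 258, TOK_INT = 259 };

#define QUEUE_CAP 16

static int pending[QUEUE_CAP];  /* circular buffer of queued tokens */
static int head = 0;            /* index of the oldest queued token */
static int count = 0;           /* number of tokens currently queued */

/* Queue a token to be handed out by a later next_token() call. */
void push_token(int tok)
{
    assert(count < QUEUE_CAP);  /* sketch: no growth, just fail loudly */
    pending[(head + count) % QUEUE_CAP] = tok;
    count++;
}

/* Remove and return the oldest queued token. */
int pop_token(void)
{
    int tok;

    assert(count > 0);
    tok = pending[head];
    head = (head + 1) % QUEUE_CAP;
    count--;
    return tok;
}

/* Stand-in for the flex-generated scanner (an assumption of this sketch):
   the first "match" returns TOK_L_PAREN and queues TOK_INT as a second
   token; every later call reports end of input by returning 0. */
int raw_scan(void)
{
    static int calls = 0;

    if (calls++ == 0) {
        push_token(TOK_INT);
        return TOK_L_PAREN;
    }
    return 0;
}

/* What the parser would call: drain the queue before scanning further. */
int next_token(void)
{
    if (count > 0)
        return pop_token();
    return raw_scan();
}
```

In an actual flex file, the same check could probably live in indented code at the very top of the rules section, since flex runs that code each time yylex() is entered; an action that wants to emit two tokens would then push the second one and return the first.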

Zifre
0

Do you mean all matches? Are you using regex functions or string functions? Use the global flag.

As for flex, I don't think you can do that. You test for a match with one pattern at a time so that's probably out of scope. Why'd you want that? As an optimization? Scoping issues?

Lesmana
dirkgently
  • To be honest, I am fairly new to flex and I am not sure. I thought I was using a combination of regular expressions and string matching. Here is an example with two rules: "(" { return L_PAREN; } and {INT} { yylval.Int = atoi(yytext); return INT; }. What I want is to be able to return two tokens at once. – Eburetto Feb 22 '09 at 09:44
-1

Usually, this is handled by a parser on top of the scanner, which gives you much cleaner code. You can emulate that to some degree with start conditions (states):

%option noyywrap

%top {
#include <stdio.h>   /* for printf() in main() */

#define TOKEN_LEFT_PAREN    4711
#define TOKEN_RIGHT_PAREN   4712
#define TOKEN_NUMBER        4713
}

%x PAREN_STATE
%%
"("         { BEGIN(PAREN_STATE); return TOKEN_LEFT_PAREN; }
<PAREN_STATE>{
   [0-9]+   { return TOKEN_NUMBER; }
   ")"      { BEGIN(INITIAL); return TOKEN_RIGHT_PAREN; }
   .|\n     { /* maybe signal syntax error here */ }
}
%%
/* Print the numeric code of every token until yylex() returns 0 at EOF. */
int main (int argc, char *argv [])
{
  int i;

  while ((i = yylex ()) != 0)
    printf ("%d\n", i);

  return 0;
}

but this will get very messy as soon as your grammar gets more complex.

Tim Landscheidt
  • Fascinating. I'm sure at one point this answer had a score of 10 or more for many years, but in the past, hmmm, six months it has gone negative without any indication why. – Tim Landscheidt Mar 11 '18 at 18:54