My apologies if the title of this thread is a little confusing. My question is: how does Flex (the lexical analyzer) resolve precedence between rules?

For example, let's say I have two tokens with similar regular expressions, written in the following order:

"//"[!\/]{1}    return FIRST;
"//"[!\/]{1}\<  return SECOND;

Given the input "//!<", will FIRST or SECOND be returned? Or both?

The FIRST pattern is listed before the SECOND pattern, but it seems that returning SECOND would be the right behavior.

Brian Tompsett - 汤莱恩
Casey Patton

1 Answer

The longest match is returned, so the input //!< yields SECOND.

From flex & bison, Text Processing Tools:

How Flex Handles Ambiguous Patterns

Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:

  • Match the longest possible string every time the scanner matches input.
  • In the case of a tie, use the pattern that appears first in the program.
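
As a rough illustration of those two rules (not how flex is actually implemented, since flex compiles all patterns into a single automaton), here is a minimal sketch in plain C. The rule table, the hand-written matchers, the extra THIRD rule, and the pick_rule function are all hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns how many leading characters of the
   input the pattern matches, or 0 if it does not match at all. */
typedef size_t (*matcher_fn)(const char *input);

struct rule {
    const char *name;
    matcher_fn match;
};

/* Hand-written stand-in for the pattern "//"[!/] */
static size_t match_first(const char *s)
{
    if (s[0] == '/' && s[1] == '/' && (s[2] == '!' || s[2] == '/'))
        return 3;
    return 0;
}

/* Hand-written stand-in for the pattern "//"[!/]< */
static size_t match_second(const char *s)
{
    size_t n = match_first(s);
    return (n != 0 && s[n] == '<') ? n + 1 : 0;
}

/* Extra rule (an assumption, not in the question): the literal "//!<".
   It ties with SECOND on that input and is listed later. */
static size_t match_third(const char *s)
{
    return strncmp(s, "//!<", 4) == 0 ? 4 : 0;
}

/* Rules in the order they would appear in the .l file. */
static const struct rule rules[] = {
    { "FIRST",  match_first  },
    { "SECOND", match_second },
    { "THIRD",  match_third  },
};

/* Returns the name of the winning rule, or NULL if no rule matches.
   If matched_len is non-NULL, it receives the length of the match. */
const char *pick_rule(const char *input, size_t *matched_len)
{
    const char *winner = NULL;
    size_t best = 0;
    size_t i;

    assert(input != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(input);
        /* Strict '>' means a later rule must be longer to win,
           so the earlier rule is kept on a tie. */
        if (n > best) {
            best = n;
            winner = rules[i].name;
        }
    }
    if (matched_len != NULL)
        *matched_len = best;
    return winner;
}
```

With this table, pick_rule("//!<", &n) selects SECOND with n equal to 4: it beats FIRST on length, and it beats THIRD because it is listed earlier.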

You can test this yourself, of course:

file: demo.l

%%
"//"[!/]   {printf("FIRST");}
"//"[!/]<  {printf("SECOND");}
%%

int main(void)
{
    /* yylex() returns 0 at end of input; the actions above never return,
       so a single call scans the whole input. */
    while (yylex() != 0)
        ;
    return 0;
}

Note that / inside a character class and < in this position don't need escaping, and {1} is redundant.

bart@hades:~/Programming/GNU-Flex-Bison/demo$ flex demo.l 
bart@hades:~/Programming/GNU-Flex-Bison/demo$ cc lex.yy.c  -lfl
bart@hades:~/Programming/GNU-Flex-Bison/demo$ ./a.out < in.txt 
SECOND

where in.txt contains //!<.

Bart Kiers
  • I had used {1} hoping that it would match strings where ! or / occurred ONLY 1 time. I got the impression it would work that way from this website: http://www.regular-expressions.info/reference.html where it says "Repeats the previous item exactly n times." – Casey Patton Jul 18 '11 at 17:33
  • @Casey, correct, `a{1}` will match an `a` exactly once, as does the pattern `a`. So you can put `{1}` after it, but it only adds noise to the regex. – Bart Kiers Jul 18 '11 at 17:41
  • @Casey, see my revised answer. – Bart Kiers Jul 18 '11 at 19:06