14

I need to process a string in C where certain words, if present, must be converted to uppercase. My first idea was to do it in lex, something like this:

%%
word1    {setToUppercase(yytext); return WORD1;}
word2    {setToUppercase(yytext); return WORD2;}
word3    {setToUppercase(yytext); return WORD3;}
%%
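The question never shows `setToUppercase`, so here is a minimal sketch of what such a helper might look like, assuming it upper-cases a NUL-terminated string in place (the body is an assumption, not the asker's code):

```c
#include <ctype.h>

/* Hypothetical implementation of the question's setToUppercase helper:
   upper-cases a NUL-terminated string in place. */
void setToUppercase(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char first: passing a negative char value
           to toupper() is undefined behaviour. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Since the string length does not change, calling it on `yytext` inside an action stays within the matched buffer.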

The problem I see is that this does not match the words when some of their characters are uppercase (e.g. Word1, wOrd1...). Handling that could mean listing every variant one by one:

%%
word1   |
Word1   |
WOrd1   {setToUppercase(yytext); return WORD1;}

%%

Is there a way to specify that these specific tokens are matched case-insensitively? I have found that I can compile the whole lexer to be case-insensitive, but that could affect other parts of my program.

If not, any workaround suggestion?

jordi

3 Answers

23

You could set case-insensitivity in the .l file:

%option caseless

You could call flex -i.

Or you could state individual rules to be case-insensitive:

(?i:word)
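As a rough sketch of how the per-rule form might sit next to ordinary case-sensitive rules in a flex specification (the `other` rule, the inline helper, and the `main` driver are made up for illustration; as noted in the comments, this needs flex rather than classic lex):

```lex
%{
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: upper-cases the matched text in place. */
static void setToUppercase(char *s)
{
    for (; *s != '\0'; ++s)
        *s = (char)toupper((unsigned char)*s);
}
%}

%option noyywrap

%%
(?i:word1)   { setToUppercase(yytext); fputs(yytext, stdout); /* matches word1, Word1, wOrd1, ... */ }
other        { ECHO; /* still case-sensitive: "Other" does not match this rule */ }
%%

int main(void)
{
    yylex();
    return 0;
}
```

Only the pattern wrapped in `(?i:...)` ignores case; every other rule keeps its normal behaviour, which is the point of using this form instead of `%option caseless` or `flex -i`.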
DevSolar
  • I have read about that option but, as I understand it, that makes every token in the lexer case-insensitive, doesn't it? – jordi Mar 27 '14 at 12:16
  • 1
    @jordi: See extended answer. Assuming you use `flex`; I have no experience with `lex` and don't know whether it supports this. – DevSolar Mar 27 '14 at 12:16
  • Oops. Sorry. I'll try that. – jordi Mar 27 '14 at 12:17
  • Looks like it is not supported in lex. I found another way. – jordi Mar 27 '14 at 13:03
  • 2
    Regarding **f**lex, `(?i:word)` is documented here: http://flex.sourceforge.net/manual/Patterns.html . You can use it after version 2.5.34. – Ciro Costa Aug 03 '15 at 01:45
2

It seems that the approach that works is this one:

(W|w)(O|o)(R|r)(D|d) {setToUppercase(yytext);}
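Writing such patterns by hand gets tedious for longer words. A small sketch of a generator, using the equivalent character-class form `[Ww][Oo]...` instead of `(W|w)(O|o)...`; the function name `caselessPattern` is hypothetical, and it assumes the word contains only letters and digits (no regex metacharacters):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: writes into `out` a lex pattern that matches `word`
   in any letter case. Returns 0 on success, -1 if `out` is too small.
   Assumes `word` holds only letters and digits. */
int caselessPattern(const char *word, char *out, size_t outSize)
{
    size_t n = 0;

    if (outSize == 0)
        return -1;

    for (; *word != '\0'; ++word) {
        unsigned char c = (unsigned char)*word;
        if (isalpha(c)) {
            /* A letter becomes a 4-character class such as [Ww]. */
            if (n + 4 >= outSize)
                return -1;
            out[n++] = '[';
            out[n++] = (char)toupper(c);
            out[n++] = (char)tolower(c);
            out[n++] = ']';
        } else {
            /* Digits are copied unchanged. */
            if (n + 1 >= outSize)
                return -1;
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return 0;
}
```

For the input `word1` this produces `[Ww][Oo][Rr][Dd]1`, which can be pasted into the rules section of the .l file.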
jordi
  • 1
    That's what you get for using ancient tools. ;-) – DevSolar Mar 27 '14 at 13:35
  • 1
    Bear Grylls of programming :) – jordi Mar 27 '14 at 16:00
  • 6
    Back in the day, I used to paste 26 definitions into my lex files: `A [Aa]`, `B [Bb]`, ..., and then you can write `{W}{O}{R}{D}`. Although it's not much shorter than `[Ww][Oo][Rr][Dd]`, it's a bit easier to type. `flex` is better. – rici Mar 27 '14 at 18:57
2

It's very simple: write your patterns and actions as they are, and pass the -i flag when generating the lexer: lex -i filename.l. This works on Linux systems.

Suraj MU