
I am trying to write a lexer in Racket using parser-tools/lex and parser-tools/lex-sre, and I would like to create a token for strings. The problem is that the lexer matches greedily (it always takes the longest match), so if I have:

"this is" .... "cool" 

it becomes a single token instead of

StringToken, Tokens...., StringToken.

How can I fix this and make the match lazy (non-greedy)? So far I have this:

    [(:
      #\"
      (repetition 0 +inf.0
                  (complement (or #\newline whitespace)))
      #\")
     (token-STRING lexeme)]

But, as I said, it doesn't do the job well.

Thanks, Idan.M.

IDANG

2 Answers


I found a little trick that does the job. I don't like it much (because it limits the language a bit), but until someone answers with something better:

    [(:
      #\"
      (repetition 0 +inf.0 (~ #\"))
      #\")
     (token-STRING (substring lexeme 1 (sub1 (string-length lexeme))))]
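
For context on why this rule behaves differently from the one in the question: in parser-tools/lex, `complement` is a complement over whole strings (it matches any string the inner pattern does not match), while `~` from parser-tools/lex-sre is a complement over single characters. Below is a minimal, self-contained sketch of a lexer built around that rule. The token names `WORD` and `EOF`, the `WORD` rule, and the `tokenize` helper are assumptions added for illustration and are not part of the original answer; the sre operators are imported with a `:` prefix, so `~` is written `:~`.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; STRING comes from the answer; WORD and EOF are illustrative additions.
(define-tokens value-tokens (STRING WORD))
(define-empty-tokens marker-tokens (EOF))

(define string-lexer
  (lexer
   [(eof) (token-EOF)]
   ;; Skip whitespace between tokens.
   [whitespace (string-lexer input-port)]
   ;; Opening quote, any run of non-quote characters, closing quote.
   ;; The match cannot run past the first closing quote because the
   ;; middle part is not allowed to contain a quote at all.
   [(:: #\" (:* (:~ #\")) #\")
    (token-STRING (substring lexeme 1 (sub1 (string-length lexeme))))]
   ;; Anything else that is neither a quote nor whitespace.
   [(:+ (:~ #\" whitespace)) (token-WORD lexeme)]))

;; Hypothetical helper: lex a whole string into (name value) pairs.
(define (tokenize str)
  (define in (open-input-string str))
  (let loop ([acc '()])
    (define tok (string-lexer in))
    (if (eq? (token-name tok) 'EOF)
        (reverse acc)
        (loop (cons (list (token-name tok) (token-value tok)) acc)))))

(tokenize "\"this is\" foo \"cool\"")
```

With this input the lexer should produce two separate STRING tokens with a WORD token between them. The limitation mentioned above still applies: a string literal cannot contain a quote character unless an escape rule is added.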
user16217248
IDANG

The brag package has from/to, which does precisely what you want: non-greedy matching from an opening delimiter (from) to the first occurrence of a closing delimiter (to).

So either install and use this package, or take a look at the source code for how it works. AFAICS, their solution looks kinda hacky as well (comparable to yours), but it's encapsulated so that you need to do this hacky bit just once when you define the abstraction.
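
As a rough illustration of the "define the abstraction once" idea, here is a sketch that uses only parser-tools/lex, so it runs without installing brag. It is not brag's API and may not match brag's source: `delimited` is a hypothetical name, and the technique shown is the usual "complement of any string containing the closing delimiter" pattern. The `WORD` and `EOF` tokens and the `tokenize` helper are likewise assumptions for the demo.

```racket
#lang racket
(require (for-syntax racket/base)
         parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; Hypothetical lexer abbreviation (not brag's from/to): match OPEN,
;; then any text that does not contain CLOSE, then CLOSE. The awkward
;; complement pattern is written once here and reused by name.
(define-lex-trans delimited
  (lambda (stx)
    (syntax-case stx ()
      [(_ open close)
       #'(:: open
             (complement (:: any-string close any-string))
             close)])))

(define-tokens value-tokens (STRING WORD))
(define-empty-tokens marker-tokens (EOF))

(define demo-lexer
  (lexer
   [(eof) (token-EOF)]
   ;; Skip whitespace between tokens.
   [whitespace (demo-lexer input-port)]
   ;; A string runs from a quote to the first following quote.
   [(delimited "\"" "\"")
    (token-STRING (substring lexeme 1 (sub1 (string-length lexeme))))]
   [(:+ (:~ #\" whitespace)) (token-WORD lexeme)]))

;; Hypothetical helper: lex a whole string into (name value) pairs.
(define (tokenize str)
  (define in (open-input-string str))
  (let loop ([acc '()])
    (define tok (demo-lexer in))
    (if (eq? (token-name tok) 'EOF)
        (reverse acc)
        (loop (cons (list (token-name tok) (token-value tok)) acc)))))

(tokenize "\"this is\" foo \"cool\"")
```

Because the delimiters are arbitrary strings, the same abbreviation would also cover multi-character delimiters such as `/*` and `*/`, which a single-character complement cannot express.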

Sorawee Porncharoenwase