I need to extract all string literals from a given C# file. All conditional compilation symbols (e.g. DEBUG in #if DEBUG) are assumed to be undefined, i.e. false, and the file can be assumed to be syntactically correct. Both regular single-line literals ("a\u1000b") and verbatim literals (@"x""\y") should be supported.
At first I tried regular expressions, but then I realized that I also need to handle single-line and multi-line comments correctly, as well as logical expressions in #if directives.
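To make the requirements concrete, here is a made-up input; the class name, field names and the TRACE symbol are purely illustrative. The trailing comments mark what I would expect an extractor to do with each line:

```csharp
// Hypothetical sample input, invented only to illustrate the cases above.
class Sample
{
    // "not a literal": this quote pair sits inside a single-line comment
    /* "not a literal": this one sits inside
       a multi-line comment */
    string a = "a\u1000b";          // extract: regular literal
    string b = @"x""\y";            // extract: verbatim literal, "" is an escaped quote
    string c = "// still a string"; // extract: comment marker inside a literal
    char q = '"';                   // skip: char literal, not a string
#if DEBUG
    string d = "skipped";           // skip: DEBUG is assumed to be false
#endif
#if !DEBUG || TRACE
    string e = "kept";              // extract: !DEBUG is true when DEBUG is false
#endif
}
```

A plain regular expression over the raw text tends to report the quoted text inside both comments and the literal under #if DEBUG, and it can also be thrown off by the quote in the char literal, which is why I suspect a real lexer is needed.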
So, before I start writing my own C# lexer, I would like to ask whether there are any existing solutions.