I need to match C++ preprocessor statements. Now, preprocessor statements may span multiple lines:

#define foobar \
    "something glorious"

This final backslash may itself be escaped, so the following results in two separate lines:

#define foobar \\
No longer in preprocessor.

The question is how I can match the explicit line continuation efficiently. I have the following expression, which I think works. Basically, it tests whether the number of backslashes directly before the newline is odd. Is this correct? Can it be done more efficiently?

/
    [^\\]           # Something that's not an escape character, followed by …
    (?<escape>\\*?) # … any number of escapes, …
    (?P=escape)     # … twice (i.e. an even number).
    \\ \n           # Finally, a backslash and newline.
/x

(I'm using PHP so PCRE rules apply but I'd appreciate answers in any Regex vernacular.)

Konrad Rudolph
  • Do you want to match the entire preprocessor statement or just the explicit line continuation? – Tomalak May 03 '09 at 13:31
  • Just the explicit line continuation. Reason is complicated; basically, I’m using a state machine that uses regular expressions for its transitions and I’m already inside the state representing the preprocessing instruction. I now need to prevent the machine from leaving the state prematurely at the end of the line by consuming the explicit line continuation. I need the regular expression to consume it. – Konrad Rudolph May 03 '09 at 13:41

1 Answer

I think you're making it more difficult than it needs to be. Try this:

/
  (?<!\\)    # not preceded by a backslash
  (?:\\\\)*  # zero or more escaped backslashes
  \\ \n      # single backslash and linefeed
/x
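
To check the odd/even behaviour, here is a minimal sketch that runs the same pattern against a few inputs. It assumes Python's `re` module in place of PHP's PCRE (the lookbehind and verbose-mode syntax used here are the same in both); the names `CONTINUATION` and `is_continued` are made up for this illustration.

```python
import re

# Assumption: Python's re module stands in for PCRE; the pattern is unchanged.
CONTINUATION = re.compile(r"""
    (?<!\\)    # not preceded by a backslash
    (?:\\\\)*  # zero or more escaped backslashes (consumed in pairs)
    \\ \n      # single backslash and linefeed
""", re.VERBOSE)


def is_continued(text):
    """Return True if text contains an unescaped backslash-newline."""
    return CONTINUATION.search(text) is not None


# One backslash before the newline: explicit line continuation.
print(is_continued("#define foobar \\\n"))
# Two backslashes: the second one is escaped, so no continuation.
print(is_continued("#define foobar \\\\\n"))
# Three backslashes: one escaped pair plus a continuation.
print(is_continued("#define foobar \\\\\\\n"))
```

Because the lookbehind is zero-width, the match consumes only the backslashes and the newline, and it also works when the backslash run starts at the very beginning of the input, where a leading `[^\\]` would have nothing to match.
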
Alan Moore