
Why is this positive lookahead not matching the text in bold? (not the . and not the ->)

[_a-z0-9]+(?=\.|->)[_a-z0-9]+
hints6.ai_flags = 0; // comment hints.ai_flags
hints6.ai_family = AF_UNSPEC;
int newsocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
tchrist

1 Answer


first, you need to escape the `.` (an unescaped `.` matches any character; you want `\.` to match a literal ".") [weird - someone edited the question to fix that]

second, a lookahead looks ahead - it doesn't consume anything. so, even if you escape the `.`, `[_a-z0-9]+(?=\.|->)` only consumes "hints6" (and then, looking ahead, confirms that the next character is "."). after that, the second `[_a-z0-9]+` has to start matching at the "." itself, which it can't, so the whole pattern fails.
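
a quick way to see this, sketched in python's `re` module (an assumption on my part; the question doesn't say which engine is in use, but lookahead behaves the same way in the common ones). the variable names are just for illustration:

```python
import re

line = "hints6.ai_flags = 0; // comment hints.ai_flags"

# The lookahead only checks that "." or "->" comes next; it consumes nothing,
# so the second [_a-z0-9]+ has to start at the "." itself and cannot match.
original = re.compile(r"[_a-z0-9]+(?=\.|->)[_a-z0-9]+")
no_match = original.search(line)

# Dropping the trailing character class shows what the first part consumes.
prefix_only = re.compile(r"[_a-z0-9]+(?=\.|->)")
first_word = prefix_only.search(line).group(0)

print(no_match)    # None
print(first_word)  # hints6
```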

i don't really understand what you are trying to do. as far as i can see, `[_a-z0-9]+(?:\.|->)[_a-z0-9]+` would do what you want, without any lookahead (`(?:...)` is a non-capturing group: it groups the alternatives without creating a numbered capture).

in general, you don't need lookahead much, because you can just match. it's used mainly when you want to match a group, but for some (strange) reason need to check a specific case beforehand.

[edit:] if you just want to capture the two words then use `([_a-z0-9]+)(?:\.|->)([_a-z0-9]+)` (note that this captures two groups, one for each word).
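
to make the difference between the plain match and the capturing version concrete, here is a small sketch in python's `re` (again an assumed engine; `whole` and `pairs` are illustrative names), run against the `socket(...)` line from the question:

```python
import re

line = ("int newsocket = socket(result->ai_family, "
        "result->ai_socktype, result->ai_protocol);")

# Non-capturing group only: each match is the whole "word.word" / "word->word".
whole = re.findall(r"[_a-z0-9]+(?:\.|->)[_a-z0-9]+", line)

# Two capturing groups: each match yields the two words without the separator.
pairs = re.findall(r"([_a-z0-9]+)(?:\.|->)([_a-z0-9]+)", line)

print(whole)  # ['result->ai_family', 'result->ai_socktype', 'result->ai_protocol']
print(pairs)  # [('result', 'ai_family'), ('result', 'ai_socktype'), ('result', 'ai_protocol')]
```

the second form only helps if the tool using the regex lets you pick out individual groups.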

[edit2:] if this is for a syntax highlighter then i think you could go with something that highlights either word. it's easier to show than explain:

(?:[_a-z0-9]+(?=(?:\.|->)[_a-z0-9]+)|(?<=[_a-z0-9]+(?:\.|->))[_a-z0-9]+)

that will highlight either "a word followed by .word" or "a word preceded by word.", which will highlight both words without highlighting the dot. they are two separate matches, but the syntax highlighter doesn't care about that (i assume).
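
a sketch of the same idea in python, with one caveat: python's `re` only accepts fixed-width lookbehind, so the `(?<=[_a-z0-9]+(?:\.|->))` part can't be tried there verbatim. the version below is my adaptation, not the pattern above: it checks a single word character plus the separator, using one fixed-width lookbehind per separator. whether a given highlighter's engine needs this change is something to check against that engine.

```python
import re

WORD = r"[_a-z0-9]+"

pattern = re.compile(
    # a word followed by ".word" or "->word"
    WORD + r"(?=(?:\.|->)" + WORD + r")"
    # or a word preceded by a word character and "." or "->"
    + r"|(?:(?<=[_a-z0-9]\.)|(?<=[_a-z0-9]->))" + WORD
)

dot_line = "hints6.ai_flags = 0; // comment hints.ai_flags"
arrow_line = "int newsocket = socket(result->ai_family, result->ai_socktype);"

dot_words = pattern.findall(dot_line)
arrow_words = pattern.findall(arrow_line)

print(dot_words)    # ['hints6', 'ai_flags', 'hints', 'ai_flags']
print(arrow_words)  # ['result', 'ai_family', 'result', 'ai_socktype']
```

each word comes back as its own match and the separators are never part of a match.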

andrew cooke
  • I used `[_a-z0-9]+(?:\.|->)[_a-z0-9]+` but it matches `hints6.ai_flags` when I only want `hints6` and `ai_flags` – Richard Ilos Mar 07 '12 at 12:13
  • Is it possible to do it without grouping ? The regex I am using does not provide a method to call groups; that is why I was trying to use an alternative method. – Richard Ilos Mar 07 '12 at 12:26
  • not that i know of. see lack of replies at http://stackoverflow.com/questions/4002218/regex-exclude-sub-group-text-from-being-included-in-a-parent-sub-group – andrew cooke Mar 07 '12 at 12:29
  • @kikumbob i think you meant to ask RichardIlos – andrew cooke Mar 07 '12 at 13:27
  • @kikumbob I am editing my syntax-highlighter in Gedit (gtksourceview). Apparently, the config file does not support selecting only groups recursively when highlighting: /usr/share/gtksourceview-2.0/language-specs/c.lang – Richard Ilos Mar 07 '12 at 13:55
  • I posted a question about this yesterday. http://stackoverflow.com/questions/9592805/gedit-syntax-highlighting-with-gtksourceview-for-backreferencing-sub-patterns – Richard Ilos Mar 07 '12 at 14:38
  • that's a different issue. the above (edit 2) will work for the details in this question, but if "hints6" appears alone, elsewhere, then it won't be highlighted. that's just a basic limitation of the highlighting (according to that question's answer). – andrew cooke Mar 07 '12 at 14:48