Suppose that within a regex, once one alternative of an alternation matches, the engine stops there, even if more alternatives remain (assume there are no other tokens in the regex outside the alternation).
This pattern searches for a doubled word (e.g., this this):
\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)
I am confused by what happens with the following subject string, which matches the pattern:
"<i>whatever<i> whatever"
\b([a-z]+)
Matches the first "whatever".
((?:\s|<[^>]+>)+)
A tag follows, so the 2nd alternative matches.
(\1\b)
This has to match the same word that was captured by the first set of parentheses, and that word has to come next.
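The step-by-step reading above can be checked with a minimal sketch in Python (my choice of language; the names PATTERN and SUBJECT are only for illustration). It runs the full doubled-word pattern on the subject string and prints what each group captured.

```python
import re

# The doubled-word pattern and the subject string from the question.
PATTERN = re.compile(r'\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)')
SUBJECT = '<i>whatever<i> whatever'

match = PATTERN.search(SUBJECT)

# Overall match, then the three capturing groups.
print(match.group(0))   # whatever<i> whatever
print(match.groups())   # ('whatever', '<i> ', 'whatever')
```

Note that the second group holds both the tag and the space that follows it.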
Why does the regex match when what follows the tag is not (\1\b) but whitespace?
I know that \s is one of the alternatives in the alternation.
But isn't the tag match supposed to use up the alternation?
Why is the \s alternative still alive?
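One experiment that may help narrow this down, again sketched in Python with hypothetical names: compare the original pattern with a variant where the + after the non-capturing group is removed, so the alternation can be attempted only once.

```python
import re

SUBJECT = '<i>whatever<i> whatever'

# Original pattern: the alternation group is repeated by +.
WITH_PLUS = re.compile(r'\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)')
# Variant for comparison: same alternation, but without the + quantifier.
WITHOUT_PLUS = re.compile(r'\b([a-z]+)((?:\s|<[^>]+>))(\1\b)')

with_plus_result = WITH_PLUS.search(SUBJECT)
without_plus_result = WITHOUT_PLUS.search(SUBJECT)

print(with_plus_result is not None)     # True
print(without_plus_result is not None)  # False
```

If this holds in other regex flavors as well, it suggests the \s alternative gets its chance because the + re-enters the group for another pass, not because the alternation stays open after the tag has matched.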