I am trying to use a regular expression to split a string apart while keeping the delimiters. I want to split a very large and unpredictable string on its anchor tags. I am using HTML Tidy to ensure the tags are well formed, but anything could come before or after each anchor tag I want to match.
*PRECEDING-ANYTHING*<a *ANYTHING*>*ANYTHING*</a>*FOLLOWING-ANYTHING*
*PRECEDING-ANYTHING*<a *ANYTHING*>*ANYTHING*</a>*FOLLOWING-ANYTHING*
Here the href URL could be anything, and additional attributes such as 'target' may also be present with any value.
I've done a lot of searching and testing and either I am doing something wrong or the other answers on Stack Overflow do not apply.
Using
$parts = preg_split($pattern, $textWithAnchors, -1, PREG_SPLIT_DELIM_CAPTURE);
I was hoping to have $parts be similar to the following.
$parts[0] is equal to *PRECEDING-ANYTHING*
$parts[1] is equal to <a *ANYTHING*>*ANYTHING*</a>
and so forth
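To make the goal concrete, here is a rough sketch of the behavior I am after. The sample string and the pattern are placeholders made up for illustration. The pattern is only an assumption: it relies on anchors not being nested and on no attribute value containing a literal '>', so it may well not hold up against arbitrary input.

```php
<?php
// Hypothetical sample input; the real text is much larger and unpredictable.
$textWithAnchors = 'before <a href="http://example.com" target="_blank">link <b>text</b></a> after';

// Assumed pattern (not a vetted solution). The single capturing group is what
// makes PREG_SPLIT_DELIM_CAPTURE return each matched anchor as its own element.
// Modifiers: i = case-insensitive, s = dot also matches newlines.
$pattern = '~(<a\b[^>]*>.*?</a>)~is';

$parts = preg_split($pattern, $textWithAnchors, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($parts);
```

With this sample, $parts[0] holds the text before the anchor, $parts[1] holds the whole anchor element, and $parts[2] holds the text after it.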
It is important that the regular expression captures each anchor element in full: the opening tag, the closing tag, and everything in between.
I would very much appreciate any help. I am asking specifically for a regular expression that will accomplish this in PHP. I am aware that there are HTML parsers, but a regular expression is the better fit in this situation. Maybe it will be a learning experience, though.