
I’m writing an HTML and BBCode parser. I have this regex, which matches nested tags from the inside out:

$re = '/<b>((?:(?!<\/?b>).)*)<\/b>/is';
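
To make "inside out" concrete, here is a minimal sketch of how that pattern behaves on nested input. The sample string is an assumption chosen for illustration; the pattern is the one above, unchanged.

```php
<?php
// Pattern from the question: the tempered token (?:(?!<\/?b>).)* refuses to
// step over any <b> or </b>, so only an innermost pair can match.
$re = '/<b>((?:(?!<\/?b>).)*)<\/b>/is';

// Illustrative input (assumed, not from the original post).
$str = '<b>outer <b>inner</b> tail</b>';

// A single pass converts only the innermost pair.
$once = preg_replace($re, '[b]$1[/b]', $str);
echo $once, "\n"; // <b>outer [b]inner[/b] tail</b>

// Repeating until nothing changes works outwards, one nesting level per pass.
do {
    $prev = $str;
    $str = preg_replace($re, '[b]$1[/b]', $str);
} while ($str !== $prev);
echo $str, "\n"; // [b]outer [b]inner[/b] tail[/b]
```

The outer pair only becomes matchable once the inner pair has been rewritten to [b]...[/b], because the lookahead no longer sees a <b> or </b> between the outer tags.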

But I also want to match <b class=”string”>text</b>, or a b tag with any other attribute. I was already doing that with:

'/<b((\s)+?.*?)?\>(.*?)<\/b>/is'

But now, when I try to combine the two and extend the negative lookahead to cover attributes, I cannot make it work.

I tried '/<b((\s)+?.*?)?\((?:(?!</?b((\s)+?.*?)?>).)*)<\/b>/is' but it does not work properly for this input:

<b class=”string2”><b class=”string”>text</b></b>

It matches from the first b tag, and it shouldn’t. I would like to get:

<b class=”string2”>[b]text[/b]</b>
Vixxs

1 Answer


This will replace all <b> tags with [b]:

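<?php

$str = '<b>test</b><b class=”string2”><b class=”string”>text</b></b>';
$prev = '';
// Keep replacing until a pass leaves the string unchanged.
while ($prev !== $str) {
    $prev = $str;
    // <b plus optional attribute characters, then a lazy (.*?) up to the nearest </b>.
    $str = preg_replace("/<b[ a-z0-9\"'\=”]*?>(.*?)<\/b>/is","[b]$1[/b]",$str);
}
echo $str;

?>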
Neil
  • The first problem I see is that it does not work for without attributes; the second problem is that this matches
    , which can be fixed by adding but the first problem still exists.
    – Vixxs Mar 10 '17 at 18:40
  • But now, with (.*?) in the middle again, it does not parse HTML as it should: instead of matching HTML tags from the inside out, it pairs the first opening tag with the first closing tag even when there are more opening tags in between, and that's not how HTML works. That is the tricky part. – Vixxs Mar 10 '17 at 19:20
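
One way to get the inside-out behaviour together with attributes is to keep the tempered lookahead from the question and let both the opening tag and the lookahead accept "b followed by a word boundary". The following is only a sketch under stated assumptions: the helper name bTagsToBbcode and its $repeat flag are hypothetical, the sample inputs use plain ASCII quotes, and attribute values are assumed never to contain a literal ">".

```php
<?php
// Hypothetical helper (not from the thread): converts <b ...> pairs to
// [b]...[/b], innermost first. Assumes attribute values never contain ">".
function bTagsToBbcode(string $html, bool $repeat = true): string
{
    // <b\b[^>]*>   opening tag with optional attributes; \b keeps <br>, <body> out
    // (?!<\/?b\b)  tempered token: never step over another <b ...> or </b>
    $re = '/<b\b[^>]*>((?:(?!<\/?b\b).)*)<\/b>/is';
    do {
        $prev = $html;
        $next = preg_replace($re, '[b]$1[/b]', $prev);
        if ($next === null) {   // PCRE error: return the last good string
            return $prev;
        }
        $html = $next;
    } while ($repeat && $html !== $prev);
    return $html;
}

$in = '<b class="string2"><b class="string">text</b></b>';
echo bTagsToBbcode($in, false), "\n"; // <b class="string2">[b]text[/b]</b>
echo bTagsToBbcode($in), "\n";        // [b][b]text[/b][/b]
echo bTagsToBbcode('<b>line<br>break</b>'), "\n"; // [b]line<br>break[/b]
```

With $repeat set to false, a single pass converts only the innermost pair, which is the intermediate result asked for in the question; the default keeps looping until every level has been converted.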