Much of your difficulty comes from the fact that HTML is not a regular language, which is why regular expressions struggle with it; see Coding Horror: Parsing Html the Cthulhu Way
Consider using a query language powerful enough to process (X)HTML, such as XPath, or use the DOM programmatically to fetch all image tags and then exclude those with <a> ancestors.
In PHP 5, I believe you can use DOMXPath; with that, it becomes as simple as:
// Sample markup: one image inside a link, one inside a div, one at the top level.
$generated_string = '<a href="index.html"><img src="images/inside_a.jpg" /></a>' .
    '<div><img src="images/inside_div.jpg" /></div>' .
    '<img src="images/inside_nothing.jpg" />';
$doc = new DOMDocument();
$doc->loadHTML($generated_string);
$xpath = new DOMXPath($doc);
// Select every <img> that has no <a> element anywhere among its ancestors.
$elements = $xpath->query("//img[not(ancestor::a)]");
foreach ($elements as $element) {
    echo $doc->saveXML($element) . "\n";
}
This code would give the output:
<img src="images/inside_div.jpg"/>
<img src="images/inside_nothing.jpg"/>
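The other option mentioned above, walking the DOM programmatically instead of using XPath, might look like the sketch below. The helper name `hasAnchorAncestor` and the sample markup are hypothetical, chosen only for illustration; the image nested inside a `<span>` within the link shows that the whole ancestor chain is checked, not just the direct parent.

```php
<?php
// Hypothetical helper: true when $node has an <a> element among its ancestors.
function hasAnchorAncestor(DOMNode $node)
{
    // Walk upward until the document node, whose parentNode is null.
    for ($p = $node->parentNode; $p !== null; $p = $p->parentNode) {
        if ($p instanceof DOMElement && strtolower($p->tagName) === 'a') {
            return true;
        }
    }
    return false;
}

// Illustrative markup (an assumption, not from the original question).
$html = '<a href="index.html"><span><img src="images/nested_in_a.jpg" /></span></a>'
      . '<div><img src="images/inside_div.jpg" /></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Collect the src of every <img> that is not inside a link.
$kept = array();
foreach ($doc->getElementsByTagName('img') as $img) {
    if (!hasAnchorAncestor($img)) {
        $kept[] = $img->getAttribute('src');
    }
}

foreach ($kept as $src) {
    echo $src . "\n"; // prints images/inside_div.jpg
}
```

This is more verbose than the XPath query, but it can be easier to extend if the filtering rule later depends on logic that is awkward to express in XPath.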