I have an HTML string that is well-formed.
After I load it into a PHP DOMDocument object and walk the node tree, the tree I print out does not match the HTML:
The table node is inside a #text node.
The 2nd td node is inside the first td node.
The 2nd tr node is inside the first tr node.
The 4th td node is inside the 3rd td node.
The #text 'after' is inside the table node.
Why is this wrong and how can I fix it?
The code below is executed here:
https://dev.aecperformance.com/test.php
//Formatted so you can easily see the format
$html = "<body>
<div style='border:1px solid blue; padding:10px;' >
This is a <b>bold <span style='color:red'>red test</span></b> a table
<table style='display:inline-block; border:1px solid green; padding:0'>
<tr><td>Head 1</td><td>Head 2</td></tr>
<tr><td>Value 1</td><td>Value 2</td></tr>
</table>
after
</div>
After div
</body>";
//Formatted with all tabs and line feeds stripped
$html = "<body><div style='border:1px solid blue; padding:10px;' >This is a <b>bold <span style='color:red'>red test</span></b> a table<table style='display:inline-block; border:1px solid green; padding:0'><tr><td>Head 1</td><td>Head 2</td></tr><tr><td>Value 1</td><td>Value 2</td></tr></table>after</div>After div</body>";
$nNxtLvl = 0;
function processChildNodes($node)
{
    global $nNxtLvl;
    $lvl = $nNxtLvl;
    for ($i = 0; $i < $lvl; $i++) {
        echo " ";
    }
    echo $node->nodeName;
    if ($node->nodeName == "#text") echo " " . $node->nodeValue;
    echo "<br>";
    $cNodes = $node->childNodes;
    if (!empty($cNodes)) {
        foreach ($cNodes as $cNode) {
            $nNxtLvl++;
            processChildNodes($cNode);
        }
    }
    $nNxtLvl = $lvl;
}
$dom = new \DOMDocument();
$dom->loadHTML($html);
$ls = $dom->getElementsByTagName('body');
$elBody = $ls[0];
$ls = $elBody->childNodes;
for ($i = 0; $i < count($ls); $i++) {
    processChildNodes($ls->item($i));
}
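
One thing worth ruling out is the dump helper itself rather than DOMDocument. In processChildNodes, $nNxtLvl is incremented once per child inside the foreach and is not reset between siblings, so each sibling is printed one level deeper than the one before it. That would produce exactly the symptoms listed above (the 2nd td appearing inside the 1st, 'after' appearing inside the table) even if the parsed tree is correct. Below is a minimal sketch that passes the depth down as an argument instead of using a global. The function name dumpNode and the shortened $html are made up for illustration, and it returns plain text with newlines rather than echoing <br>.

<?php
// Hypothetical helper: returns an indented text dump of $node and its descendants.
// The depth is an argument, so all siblings share the same indentation.
function dumpNode(DOMNode $node, int $depth = 0): string
{
    $line = str_repeat('  ', $depth) . $node->nodeName;
    if ($node->nodeType === XML_TEXT_NODE) {
        $line .= ' ' . $node->nodeValue;
    }
    $out = $line . "\n";
    // "?? []" guards against PHP versions where childNodes may be null on leaf nodes.
    foreach ($node->childNodes ?? [] as $child) {
        $out .= dumpNode($child, $depth + 1);
    }
    return $out;
}

// Shortened version of the HTML from the question (an assumption for brevity).
$html = "<body><div>a table<table><tr><td>Head 1</td><td>Head 2</td></tr></table>after</div>After div</body>";

$dom = new DOMDocument();
$dom->loadHTML($html);
$body = $dom->getElementsByTagName('body')->item(0);

$tree = '';
foreach ($body->childNodes as $child) {
    $tree .= dumpNode($child);
}
echo $tree;

With this version both td nodes should come out at the same indentation under their tr, and 'after' should line up with the table as a child of the div. If the output still looks wrong with the depth passed explicitly, then the parser is the next thing to look at.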