I am using the following code as a test case:
$dom = new DOMDocument();
$dom->preserveWhiteSpace = TRUE;
// Suppress parser warnings about imperfect markup
@$dom->loadHTML($html);
// Select every element in the document
$mydocnodes = $dom->getElementsByTagName('*');
foreach ($mydocnodes as $node) {
    $title_text = $node->textContent;
    $tag_text = $node->tagName;
    print $title_text . " in a " . $tag_text . " and my next sibling is " . $node->nextSibling->tagName . "<br>";
}
When the HTML is all on one line, such as:
<html><body><h1>hello</h1><p>I am text</p><p>I am text</p></body></html>
nextSibling works fine. However, when the HTML is formatted across multiple lines as below, it does not work and the tagName values of the siblings are null. It appears as though a sibling has to be on the exact same line, not just at the same level in the DOM.
<html>
<body>
<h1>hello</h1>
<p>I am text</p>
<p>I am text</p>
</body>
</html>
Given that most HTML is formatted across multiple lines, how can I load my HTML into the DOMDocument so that nextSibling and previousSibling work?
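For reference, here is a minimal diagnostic sketch. It rests on an assumption I have not confirmed: that the null values come from text nodes (the line breaks between tags) sitting between the elements, which have no tagName. The helper name next_element_sibling is hypothetical and not part of DOMDocument; it simply skips anything that is not an element.

```php
<?php
// Hypothetical helper (not a DOM API): walk forward through the siblings
// until an element node is found, skipping text nodes and comments.
function next_element_sibling(DOMNode $node): ?DOMNode
{
    $sibling = $node->nextSibling;
    while ($sibling !== null && $sibling->nodeType !== XML_ELEMENT_NODE) {
        $sibling = $sibling->nextSibling;
    }
    return $sibling; // null when no element follows
}

// The multi-line HTML from the question
$html = "<html>\n<body>\n<h1>hello</h1>\n<p>I am text</p>\n<p>I am text</p>\n</body>\n</html>";

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of using @
$dom->loadHTML($html);

foreach ($dom->getElementsByTagName('*') as $node) {
    // nodeName exists on every DOMNode, so it is safe to read on text nodes too
    $raw = $node->nextSibling !== null ? $node->nextSibling->nodeName : 'none';
    $next = next_element_sibling($node);
    print $node->tagName . ": raw nextSibling is " . $raw
        . ", next element is " . ($next !== null ? $next->tagName : 'none') . "<br>\n";
}
```

If the assumption holds, the raw nextSibling should show up as a text node for the formatted HTML, while the helper still finds the following element. A previous-sibling version would be the same loop using previousSibling.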
Many thanks!