I am trying to load a simple HTML string into DOMDocument, but whether or not I run it through HTML Tidy first, I cannot access any of its nodes.
Here is the instantiation:
$doc = new DOMDocument(/*'1.0', 'utf-8'*/);
$doc->recover = true;
$doc->strictErrorChecking = false;
$doc->formatOutput = true;
$doc->load($content);
$node_array = $doc->getElementsByTagName("body");
print_r($node_array);
...or $node_array->item(0);
I get:
DOMNodeList Object
(
)
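For comparison, here is a minimal sketch of parsing markup from a string. It assumes $content holds the markup itself rather than a path: load() treats its argument as a file name or URL, while loadHTML() parses a string. The helper name countBodyElements and the sample markup are made up for illustration and are not the pastebin content.

```php
<?php
// Hypothetical helper, not part of the original code.
function countBodyElements(string $html): int
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them,
    // since the markup is expected to be malformed.
    $previous = libxml_use_internal_errors(true);

    // loadHTML() parses markup from a string; load() expects a file path or URL.
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc->getElementsByTagName('body')->length;
}

// Deliberately unclosed tags, standing in for the real markup.
$sample = '<html><body><p>COMPOUND: C05441';
echo countBodyElements($sample), "\n";
```

If the count is still zero with loadHTML(), the problem is probably somewhere other than the load call.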
DOMDocument returns the string just fine with the save function. It is not a resource. Could I be missing dependencies or some additional PHP configuration?
Update: The DOMDocument objects simply don't implement any to-string conversion:
print_r( (string)$node_array );
Object of class DOMNodeList could not be converted to string in....
The HTML code is here: http://pastebin.com/11V92Dup (intentionally malformed, to demonstrate that Tidy properly closes the tags)
I would like to simply walk the nodes and output their content:
$node_array = $doc->getElementsByTagName("html");//parent_node();
$x = $doc->documentElement;
foreach ($x->childNodes AS $item)
{
print $item->nodeName . " = " . $item->nodeValue . "<br />";
}
UPDATE 2: I get this result, which doesn't make sense. Where does all the whitespace come from?
body =
COMPOUND: C05441
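One possible explanation, offered as a sketch rather than a confirmed diagnosis: the parser keeps whitespace between tags as text nodes, and nodeValue on an element returns the concatenated text of all its descendants, indentation and newlines included. The helper describeChildren and the sample markup below are hypothetical. The helper skips whitespace-only text nodes and trims each value before printing.

```php
<?php
// Hypothetical helper: returns "name = value" lines for the direct children of $parent.
function describeChildren(DOMNode $parent): array
{
    $lines = [];
    foreach ($parent->childNodes as $child) {
        // Whitespace between tags is kept as its own text node; skip those.
        if ($child instanceof DOMText && trim((string) $child->nodeValue) === '') {
            continue;
        }
        // For an element, nodeValue is the text of all descendants joined together,
        // so trim() strips the indentation and newlines at both ends.
        $lines[] = $child->nodeName . ' = ' . trim((string) $child->nodeValue);
    }
    return $lines;
}

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // keep parse warnings quiet
// Sample markup made up for illustration, indented like hand-written HTML.
$doc->loadHTML("<html>\n  <body>\n    <p>COMPOUND: C05441</p>\n  </body>\n</html>");
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach (describeChildren($doc->documentElement) as $line) {
    echo $line, "\n";
}
```

Setting $doc->preserveWhiteSpace = false before loading is another option sometimes suggested, though I have not checked how much it affects documents parsed with loadHTML().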