I have a string $xml that may or may not have a DOCTYPE tag. I have a custom DOCTYPE tag with entities that I want to add to $xml if it does not already have a DOCTYPE. I'm going to create a DOMDocument with $document = new \DOMDocument(); $document->loadXML($xml);
How can I efficiently determine whether or not $xml has a doctype and add my custom DOCTYPE tag if it doesn't? createDocumentType() does not offer a means to add entities or notations.
Preference given to using the DOM model over doing a pattern match on $xml.
UPDATE: Based on the comment about modifying the incoming XML, here's a code sample that demonstrates the situation:
\libxml_use_internal_errors(true);
\libxml_clear_errors();
$document = new \DOMDocument();
$document->xmlVersion = '1.0';
$document->encoding = 'UTF-8';
// Custom DOCTYPE with an internal subset that declares the entities
$doctype = <<<'XML'
<!DOCTYPE root [
<!ENTITY quot "&#34;">
<!ENTITY amp "&#38;#38;">
<!ENTITY nbsp "&#160;">
]>
XML;
$xml = '<a>&nbsp;</a>';
$document->loadXML($xml);
// No doctype found: reload with the custom DOCTYPE prepended
if (\is_null($document->doctype)) {
    $document = new \DOMDocument();
    $document->xmlVersion = '1.0';
    $document->encoding = 'UTF-8';
    $document->loadXML($doctype.$xml);
    echo $doctype.$xml."\n";
}
foreach (\libxml_get_errors() as $error) {
    // make it pretty and echo it
}
Here's the output:
<!DOCTYPE root [
<!ENTITY quot "&#34;">
<!ENTITY amp "&#38;#38;">
<!ENTITY nbsp "&#160;">
]><a>&nbsp;</a>
Fatal Error 26: Entity 'nbsp' not defined
FYI, the answer is not "it looks like you are working with HTML, use loadHTML() instead of loadXML()." The code in question works with both HTML snippets and complete documents. This is also about being able to specify custom doctypes as the code in question may handle other doctypes or more general XML cases in the future.
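One possible reading of the output, offered as a sketch rather than a confirmed diagnosis: the libxml error buffer is shared across both loadXML() calls, so the "Entity 'nbsp' not defined" entry may come from the first load (before the DOCTYPE was prepended) and not from the second. The helper below, loadWithFallbackDoctype(), is a hypothetical name that is not part of the original code. It assumes $xml carries no XML declaration (prepending a DOCTYPE in front of one would not be well-formed) and, to stay on the safe side with the predefined entities, it declares only nbsp.

```php
<?php
// Hypothetical helper (not from the original code): try the XML as-is first,
// then retry with the custom DOCTYPE prepended if no doctype was found.
function loadWithFallbackDoctype(string $xml, string $doctype): ?\DOMDocument
{
    // Collect libxml errors internally; left enabled so the caller can
    // inspect \libxml_get_errors() afterwards.
    \libxml_use_internal_errors(true);
    \libxml_clear_errors();

    $document = new \DOMDocument();
    if ($document->loadXML($xml) && $document->doctype !== null) {
        return $document; // parsed cleanly and already has a doctype
    }

    // Discard errors from the first attempt so that only errors from the
    // retry remain in the buffer.
    \libxml_clear_errors();

    $document = new \DOMDocument();
    // Assumption: $xml has no XML declaration, so the DOCTYPE can go first.
    $loaded = $document->loadXML($doctype . "\n" . $xml);

    return $loaded ? $document : null;
}

$doctype = <<<'XML'
<!DOCTYPE root [
<!ENTITY nbsp "&#160;">
]>
XML;

$document = loadWithFallbackDoctype('<a>&nbsp;</a>', $doctype);
foreach (\libxml_get_errors() as $error) {
    echo \trim($error->message), "\n";
}
```

Note that when the first load fails, the DOM cannot tell whether $xml already had a DOCTYPE. In that case the retry would contain two DOCTYPE declarations and should fail, so this sketch returns null instead of guessing.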