I have a string $xml that may or may not contain a DOCTYPE declaration. I have a custom DOCTYPE declaration with entities that I want to add to $xml if it does not already have one. I'm going to load the string into a DOMDocument with $document = new \DOMDocument(); $document->loadXML($xml);.

How can I efficiently determine whether $xml has a DOCTYPE, and add my custom DOCTYPE if it doesn't? createDocumentType() does not offer a means to add entities or notations.

I would prefer a solution that uses the DOM over one that pattern-matches on $xml.
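
For reference, a minimal sketch of the createDocumentType() limitation mentioned above, assuming a root element named a (chosen only for illustration). It builds an empty DOCTYPE through DOMImplementation, which takes a name, a public ID and a system ID, and no entity declarations:

```php
<?php
// Sketch only: the element/DOCTYPE name "a" is an assumption for illustration.
$implementation = new DOMImplementation();

// createDocumentType() accepts a qualified name, a public ID and a system ID.
// There is no parameter for an internal subset with entity declarations.
$doctype = $implementation->createDocumentType('a');

// Build a document whose root element is <a> and attach the DOCTYPE to it.
$document = $implementation->createDocument(null, 'a', $doctype);

// The doctype property is now set, but no entities are declared on it.
echo $document->saveXML();
echo 'declared entities: '.$document->doctype->entities->length."\n";
```

The resulting DOCTYPE carries a name only, so a reference such as &nbsp; would still be undeclared in a document built this way.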

UPDATE: Based on the comment about modifying the incoming XML, here's a code sample that demonstrates the situation:

\libxml_use_internal_errors(true);
\libxml_clear_errors();

$document = new \DOMDocument();
$document->xmlVersion = '1.0';
$document->encoding = 'UTF-8';

$doctype = <<<'XML'
<!DOCTYPE root [
<!ENTITY quot "&#34;">
<!ENTITY amp "&#38;">
<!ENTITY nbsp "&#160;">
]>

XML;

$xml = '<a>&nbsp;</a>';

$document->loadXML($xml);
if (\is_null($document->doctype)) {
    $document = new \DOMDocument();
    $document->xmlVersion = '1.0';
    $document->encoding = 'UTF-8';
    $document->loadXML($doctype.$xml);
    echo $doctype.$xml."\n";
}

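// Print one line per buffered libxml error, labelled by severity.
$levels = [LIBXML_ERR_WARNING => 'Warning', LIBXML_ERR_ERROR => 'Error', LIBXML_ERR_FATAL => 'Fatal Error'];
foreach (\libxml_get_errors() as $error) {
    echo $levels[$error->level].' '.$error->code.': '.\trim($error->message)."\n";
}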

Here's the output:

<!DOCTYPE root [
<!ENTITY quot "&#34;">
<!ENTITY amp "&#38;">
<!ENTITY nbsp "&#160;">
]>
<a>&nbsp;</a>
Fatal Error 26: Entity 'nbsp' not defined
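
One thing that may be worth ruling out: libxml_get_errors() returns everything buffered since the last libxml_clear_errors() call, so errors from the first loadXML($xml) attempt are still in the buffer when the loop runs. A minimal sketch that reads and clears the buffer after each attempt, assuming a DOCTYPE named after the root element a and declaring only nbsp (both are simplifications, not the declaration used above):

```php
<?php
// Sketch only: collect libxml errors separately for each load attempt.
libxml_use_internal_errors(true);

$xml = '<a>&nbsp;</a>';
// Assumption: the DOCTYPE name matches the root element (a) and declares only nbsp.
$doctype = "<!DOCTYPE a [\n<!ENTITY nbsp \"&#160;\">\n]>\n";

// First attempt: the bare fragment, which has no declaration for nbsp.
libxml_clear_errors();
$probe = new DOMDocument();
$firstOk = $probe->loadXML($xml);
$firstErrors = libxml_get_errors();

// Clear the buffer so the next check only sees errors from the second load.
libxml_clear_errors();
$document = new DOMDocument();
$secondOk = $document->loadXML($doctype.$xml);
$secondErrors = libxml_get_errors();
libxml_clear_errors();

printf("first load: %s, %d error(s)\n", $firstOk ? 'ok' : 'failed', count($firstErrors));
printf("second load: %s, %d error(s)\n", $secondOk ? 'ok' : 'failed', count($secondErrors));
```

If the second count is zero, the reported error comes from the probe load rather than from the document that carries the DOCTYPE.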

FYI, the answer is not "it looks like you are working with HTML, use loadHTML() instead of loadXML()." The code in question works with both HTML snippets and complete documents. This is also about being able to specify custom doctypes as the code in question may handle other doctypes or more general XML cases in the future.

Jay Bienvenu
  • Can't you use the `doctype` property? http://php.net/manual/en/class.domdocument.php#domdocument.props.doctype – Nick Nov 02 '18 at 00:26
  • No, it's readonly. If it doesn't exist you can't create it. – Jay Bienvenu Nov 02 '18 at 01:16
  • Possible duplicate of [Adding <!ENTITY nbsp " "> to the DOCTYPE using PHP dom](https://stackoverflow.com/questions/27876393/adding-entity-nbsp-160-to-the-doctype-using-php-dom) – miken32 Nov 02 '18 at 02:10
  • I was thinking that you could check for its existence with `doctype` and then if it didn't exist create a new `DOMDocument` by adding a doctype string to the input XML. – Nick Nov 02 '18 at 02:12