I'm trying to produce a valid XHTML document from XML data.
I'm doing so using the MSXML object library, not .NET. With .NET there are no problems; the transform works fine.

My XSL template has this:

<xsl:output
  method="xml"
  omit-xml-declaration="no"
  indent="no"
  version="1.0"
  encoding="utf-8"
  doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
/>

Then goes:

<xsl:template match="/root">
  <html xmlns="http://www.w3.org/1999/xhtml">
  ...
  </html>
</xsl:template>

And this is where the problems start.

  • If I use MSXML2.DOMDocument40, MSXML refuses to generate the XHTML because

    The attribute '{xmlns}' on this element is not defined in the DTD/Schema.

    Apparently, one of the HTML tags in the template body is not allowed to have the namespace it inherits from <html>. But MSXML won't tell me which tag that is.

    If I just strip out everything from the template and dump the XML data enclosed in <p>, then it transforms fine. Apparently, <p> is allowed to have xmlns.

    Which tag is the one that breaks the transformation?

  • If I use MSXML2.DOMDocument60, I first have to say:

    xmlTransformedResult.setProperty("ProhibitDTD", False)

    Otherwise I get "DTD is prohibited."

    With that property set, I get:

    The element 'html' is used but not declared in the DTD/Schema.

    How can I fix that?

  • If I use .NET transformation, it's all fine. The generated document starts with

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    

Now, I can remove both doctype-public and doctype-system from the template, produce just plain XML, and then manually prepend the header to it. But I don't want to. What is the proper way of making this work?

GSerg
  • How do you use MSXML, with script? Which API exactly do you use? If you use transformToObject and xmlTransformedResult is an MSXML DOM document then try setting `xmlTransformedResult.validateOnParse = false` before running the transformation. – Martin Honnen Aug 03 '11 at 11:48
  • @Martin Yes. I use `transformNodeToObject` from both VB6 and VBA. `validateOnParse` does help, but then, is that the correct way? (I would accept that it is, provided the problem is in MSXML. Otherwise, I'd rather fix my code.) – GSerg Aug 03 '11 at 11:53

1 Answer

I think the problem with MSXML 6 is that by default it neither allows DTDs nor loads them (or any external resources in general). So to avoid the validation message you need to set both (I am using JScript syntax, please adjust to your language of choice):

xmlTransformedResult.resolveExternals = true;
xmlTransformedResult.setProperty('ProhibitDTD', false);

Then I think you won't get the validation error, at least as long as the W3C keeps serving the XHTML DTD files. If you fetch them programmatically a lot you might get errors, but that does not depend on MSXML; it is simply a W3C policy to limit the traffic on their servers caused by everyone fetching such DTDs.

Martin Honnen
  • Ouch. I would prefer it to *not* go anywhere and actually download DTDs. In fact, when I was playing with it and saw that it actually went out to the Internet, I was rather amazed. I used to think of these four standard DTDs (HTML/XML strict/transitional) as known, pre-existing things, much like the standard XML namespaces, which no sane processor is going to download even though the namespace looks like a URL. BTW, it fails a lot, and it has to wait for about 30 seconds before failing, which, obviously, slows everything down enormously. – GSerg Aug 03 '11 at 18:40
  • Well, you don't have to download the DTDs and validate against them: simply leave resolveExternals at its MSXML default value of false, set validateOnParse to false, as originally suggested, and I think you can write out XHTML just fine. As for XHTML DTDs being known to the XML parser, no, I don't think MSXML has built-in knowledge of these. .NET 4.0 allows you to use http://msdn.microsoft.com/en-us/library/system.xml.resolvers.xmlpreloadedresolver%28v=VS.100%29.aspx but I am not aware of anything comparable on the MSXML side. – Martin Honnen Aug 04 '11 at 10:32
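
For reference, the no-download variant suggested in the last comment could look roughly like the sketch below, in JScript syntax as in the answer. It is only a sketch under assumptions: `prepareResultDocument` is a hypothetical helper name (not part of MSXML), the file names are placeholders, and the MSXML 6 ProgID is assumed because the question mentions MSXML2.DOMDocument60. The settings themselves are the ones discussed in this thread: `validateOnParse`, `resolveExternals` and the `ProhibitDTD` property.

```javascript
// Sketch only: prepares an MSXML result document so that the DOCTYPE emitted
// by the stylesheet is accepted, but the DTD is neither fetched nor validated.
// prepareResultDocument is a hypothetical helper, not an MSXML API.
function prepareResultDocument(doc, isMsxml6) {
  doc.async = false;
  doc.validateOnParse = false;   // do not validate against the XHTML DTD
  doc.resolveExternals = false;  // do not download the DTD from w3.org
  if (isMsxml6) {
    // MSXML 6 refuses any DOCTYPE ("DTD is prohibited") unless this is off.
    doc.setProperty("ProhibitDTD", false);
  }
  return doc;
}

// Usage under Windows Script Host; skipped where ActiveXObject is unavailable.
if (typeof ActiveXObject !== "undefined") {
  var progId = "Msxml2.DOMDocument.6.0";   // assumed MSXML 6 ProgID
  var source = new ActiveXObject(progId);
  var stylesheet = new ActiveXObject(progId);
  var result = prepareResultDocument(new ActiveXObject(progId), true);

  source.async = false;
  stylesheet.async = false;
  source.load("data.xml");          // placeholder file names
  stylesheet.load("template.xsl");

  // The transformation output is parsed into `result` with the settings above.
  source.transformNodeToObject(stylesheet, result);
  result.save("output.xhtml");
}
```

In VB6 or VBA the same three settings would be applied to the result document before calling `transformNodeToObject`. Whether skipping validation is acceptable depends on whether the output really needs to be checked against the DTD at transformation time.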