I have an XSLT that creates some CDATA within a node.

XML:

<test><inner>stuff</inner></test>

XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="test">
        <wrapper>
                <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
                <xsl:copy-of select="*"/>
                <xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
        </wrapper>
    </xsl:template>
</xsl:stylesheet>

This transform, executed via Saxon, returns:

<wrapper><![CDATA[<inner>stuff</inner>]]></wrapper>

I am aware that I am wrapping XML in CDATA and that this is kind of ridiculous. But this is what is expected by an API that I am working with, so I have no choice but to follow this pattern.

Now I am trying to include this transform as part of a larger XProc pipeline:

<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
    <p:xslt>
        <p:input port="stylesheet">
            <p:document href="test.xsl"/>
        </p:input>
    </p:xslt>
</p:pipeline>

Run with the latest version of Calabash, this returns:

<wrapper>&lt;![CDATA[<inner>stuff</inner>]]&gt;</wrapper>

It seems that XProc doesn't honor the disable-output-escaping attribute.

I went on to try a few XProc steps, including p:unescape-markup and various combinations of p:string-replace, but I couldn't find a solution that didn't adversely affect the rest of my output.

Any ideas what I might try next?

rexsavior

1 Answer

An XSLT processor is not required to support disable-output-escaping (d-o-e). The XSLT 1.0 specification says:

An XSLT processor will only be able to disable output escaping if it controls how the result tree is output. This may not always be the case. For example, the result tree may be used as the source tree for another XSLT transformation instead of being output.

This is especially true in pipelining: the XSLT step may not control serialization of the output tree, and may only pass it on to the next step in the pipeline as a DOM or as SAX events. But even if it could, the specification goes on:

An XSLT processor is not required to support disabling output escaping. If an xsl:value-of or xsl:text specifies that output escaping should be disabled and the XSLT processor does not support this, the XSLT processor may signal an error; if it does not signal an error, it must recover by not disabling output escaping.

So you really can't rely on d-o-e, especially in a pipeline.

"But this is what is expected by an API that I am working with, so I have no choice but to follow this pattern."

I can sympathize with the situation, having used faulty tools in the past because they were the best available. However, the presence (and boundaries) of a CDATA section are explicitly not in the XML Infoset. So an API that depends on CDATA sections is faulty with regard to its XML input requirements. If it truly does depend on CDATA sections, it would be a good idea to file a bug report about it.

On the other hand, maybe the API you're working with doesn't actually require CDATA sections; maybe it just requires that you feed it XML that's escaped in some way? If so, there are other ways to accomplish that, without requiring a specific serialization that is outside of the XML Infoset. If you can show us documentation about the API, we could help determine what it actually requires.
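If escaped XML turns out to be acceptable, one way to produce it inside the pipeline is the standard XProc 1.0 step p:escape-markup, which replaces the children of the document element with their serialized text. The following is only a sketch. It assumes test.xsl has been changed so that the template emits `<wrapper><xsl:copy-of select="*"/></wrapper>` with the two d-o-e `xsl:text` instructions removed, and that `<wrapper>` is the document element of the XSLT result.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
    <!-- Step 1: run the stylesheet. Assumption: test.xsl no longer uses
         disable-output-escaping and copies the children of <test> into
         <wrapper> as ordinary element nodes. -->
    <p:xslt>
        <p:input port="stylesheet">
            <p:document href="test.xsl"/>
        </p:input>
    </p:xslt>

    <!-- Step 2: serialize the children of the document element (<wrapper>)
         and replace them with a single text node holding that markup. -->
    <p:escape-markup/>
</p:pipeline>
```

With the sample input, the result should look something like `<wrapper>&lt;inner&gt;stuff&lt;/inner&gt;</wrapper>`, though the exact escaping of characters such as `>` depends on the serializer. Because the escaped markup is now an ordinary text node, it survives being passed between steps and does not depend on who controls the final serialization.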

LarsH
  • Thank you for your answer. It is pretty much the answer I expected as the output I needed was non-standard. Your suggestion to see if simply escaping the XML would work in lieu of the CDATA worked out. I am sorry I hadn't thought to try it, but I was intent on adhering to the API documentation. So, my issue seems to be solved, assuming I can successfully insert the escaped XML in XProc. I will also be giving the API designers some feedback regarding the proliferation of CDATA in their documentation. Thanks! – rexsavior Jul 16 '15 at 13:48
  • @rexsavior: Glad to hear that the API doesn't actually require CDATA, and also that you will be giving them feedback about the way CDATA is used in the documentation. If they want to be compatible with standard XML tools, it's important that they describe the requirement in its general terms (escaped XML) rather than the too-restrictive CDATA (one particular serialization of escaped XML). – LarsH Jul 16 '15 at 14:02