
I have an XML fragment like the one below:

<Detail uid="6">
    <![CDATA[
    <div class="heading">welcome to my page</div>
    <div class="paragraph">this is paraph</div>
    ]]>
</Detail>

and I want to be able to change

<div class="heading">...</div> to <h1>Welcome to my page</h1>
<div class="paragraph">...</div> to <p>this is paragraph</p>

Do you know how I can do that in XSLT 1.0?

Mads Hansen
jjennifer
  • By definition, a CDATA section doesn't need to have well-formed content, so it probably isn't safe to assume that it can be parsed as XML. – Rubens Farias Jan 14 '10 at 20:03
  • In this case, it's well-formed. The structure of the XML has been designed that way. Is it possible to remove the <![CDATA[ so I can access the elements inside as usual XML elements? – jjennifer Jan 14 '10 at 20:08

3 Answers


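What about running two transforms? The first turns the CDATA text into real markup. Its output then has to be serialized and re-parsed before the second transform runs, because disable-output-escaping only takes effect when the result is serialized (and a processor is not required to support it).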

Pass 1.)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>

  <!-- Rebuild Detail, keeping its attributes -->
  <xsl:template match="Detail">
    <Detail>
      <xsl:copy-of select="@*"/>
      <!-- Write the CDATA text unescaped so it is serialized as markup -->
      <xsl:value-of select="." disable-output-escaping="yes" />
    </Detail>
  </xsl:template>

</xsl:stylesheet>

Will produce:

<?xml version="1.0" encoding="UTF-8"?>
<Detail uid="6"> 
    <div class="heading">welcome to my page</div>
    <div class="paragraph">this is paraph</div>
</Detail>

Pass 2.)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>

  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Heading div becomes h1; only its text content is kept -->
  <xsl:template match="div[@class='heading']">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>

  <!-- Paragraph div becomes p; only its text content is kept -->
  <xsl:template match="div[@class='paragraph']">
    <p><xsl:value-of select="."/></p>
  </xsl:template>

</xsl:stylesheet>

Produces:

<?xml version="1.0" encoding="UTF-8"?>
<Detail uid="6">
<h1>welcome to my page</h1>
<p>this is paraph</p>
</Detail>
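
For readers who want to see the two-pass idea end to end without an XSLT processor at hand, here is a rough Python analogue using only the standard library. It is a sketch of the same data flow, not the stylesheets above: `pass_one`, `pass_two` and `TAG_FOR_CLASS` are hypothetical names, and it assumes the CDATA content is well-formed and that `uid` is the only attribute on Detail.

```python
import xml.etree.ElementTree as ET

SOURCE = """<Detail uid="6">
    <![CDATA[
    <div class="heading">welcome to my page</div>
    <div class="paragraph">this is paraph</div>
    ]]>
</Detail>"""

# Hypothetical mapping from div class to replacement tag.
TAG_FOR_CLASS = {"heading": "h1", "paragraph": "p"}


def pass_one(xml_text: str) -> str:
    """Serialize Detail with its CDATA text written out as markup."""
    detail = ET.fromstring(xml_text)
    # detail.text holds the CDATA content as a plain string.
    return f'<Detail uid="{detail.get("uid")}">{detail.text}</Detail>'


def pass_two(xml_text: str) -> str:
    """Re-parse the serialized output and rename the divs by class."""
    detail = ET.fromstring(xml_text)
    for div in list(detail.iter("div")):
        new_tag = TAG_FOR_CLASS.get(div.get("class"))
        if new_tag is None:
            continue
        # Like xsl:value-of, keep only the text content of the div.
        text = "".join(div.itertext())
        for child in list(div):
            div.remove(child)
        div.text = text
        div.tag = new_tag
        div.attrib.clear()
    return ET.tostring(detail, encoding="unicode")


result = pass_two(pass_one(SOURCE))
print(result)
```

The step that matters is the string handed from `pass_one` to `pass_two`: the divs only become elements once that text is parsed again, which is why the XSLT version needs two separate runs.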
Mads Hansen
  • Hi Mads! I know you are an XSLT guru. If possible, could you answer this question: http://stackoverflow.com/questions/18612639/transform-xml-from-cdata-using-xsl ? – Eazy Sep 05 '13 at 04:06

You cannot tell XSLT 1.0 to fish a string out of a CDATA section and parse it as XML.
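
As a small illustration of why, it may help to look at what a parser actually hands over for the fragment in the question. The sketch below uses Python's standard `xml.etree.ElementTree` rather than an XSLT processor, so it only approximates the tree a stylesheet would see, but the point carries over: the CDATA content arrives as text, not as child elements.

```python
import xml.etree.ElementTree as ET

SOURCE = """<Detail uid="6">
    <![CDATA[
    <div class="heading">welcome to my page</div>
    <div class="paragraph">this is paraph</div>
    ]]>
</Detail>"""

detail = ET.fromstring(SOURCE)

# The parser creates no element nodes for the divs inside the CDATA section.
child_elements = list(detail)

# The markup is there only as characters in the element's text.
text = detail.text

print(len(child_elements))
print('<div class="heading">' in text)
```

A pattern such as `div[@class='heading']` therefore has nothing to match until that text has been parsed again.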

bmargulies

You can't "remove" the CDATA, but you can achieve the desired output somewhat crudely:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <Detail>
      <!-- Text before and after the opening heading div -->
      <xsl:variable name="before" select="substring-before(//Detail,'&lt;div class=&quot;heading&quot;&gt;')" />
      <xsl:variable name="afteropen" select="substring-after(//Detail,'&lt;div class=&quot;heading&quot;&gt;')" />
      <!-- Heading text, and everything after the first closing div -->
      <xsl:variable name="body" select="substring-before($afteropen, '&lt;/div&gt;')" />
      <xsl:variable name="after" select="substring-after($afteropen, '&lt;/div&gt;')" />
      <!-- Reassemble with h1 tags and write the result unescaped -->
      <xsl:value-of select="concat($before, '&lt;h1&gt;', $body, '&lt;/h1&gt;', $after)"
                    disable-output-escaping="yes" />
    </Detail>
  </xsl:template>
</xsl:stylesheet>

This works for the first type of div you're trying to convert (the heading), and you can handle the second one in a similar way. It could be made more generic with some effort.
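
To make the string-splitting logic easier to follow, here is the same idea sketched in Python, where `str.partition` plays the role of `substring-before` and `substring-after`. The function name `rewrite_div` and its parameters are hypothetical. Like the stylesheet, it only rewrites the first matching div and would be confused by nested divs.

```python
def rewrite_div(text: str, css_class: str, new_tag: str) -> str:
    """Replace the first <div class="css_class">...</div> with <new_tag>...</new_tag>."""
    open_tag = f'<div class="{css_class}">'
    # before / rest correspond to substring-before / substring-after.
    before, found, rest = text.partition(open_tag)
    if not found:
        return text
    # body is everything up to the first closing div after the opening tag.
    body, closed, after = rest.partition("</div>")
    if not closed:
        return text
    return f"{before}<{new_tag}>{body}</{new_tag}>{after}"


cdata = (
    '\n<div class="heading">welcome to my page</div>'
    '\n<div class="paragraph">this is paraph</div>\n'
)

# Apply the same step once per div type.
result = rewrite_div(rewrite_div(cdata, "heading", "h1"), "paragraph", "p")
print(result)
```

Calling the function once per class is the "something similar" for the second div. A more generic version would loop until no opening tag is left.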

Dan