I am stuck with an XML to XML transformation using XSLT 2.0 where I need to transform this:
<p>some mixed content <x h="">START:attr="value"</x> more mixed content <x h="">END</x> other mixed content</p>
To this:
<p>some mixed content <ph attr="value"> more mixed content </ph> other mixed content</p>
So basically I'd like to replace <x h="">START:attr="value"</x>
with <ph attr="value">
and <x h="">END</x>
with </ph>
and process the rest as usual.
Does anyone know if that's possible?
My main issue is that I cannot figure out how to find the element with value END, then tell the XSLT processor (I use Saxon) to process the content between the first occurrence of <x h=""> and the second occurrence of <x h="">, and finally write the end element </ph>. I am familiar with how to create an element (including attributes).
I have a specific template to match the start element <x h="">START:attr="value"</x>. Since the XML document I process contains many other elements, I'd prefer a recursive solution, i.e. one that continues processing the content found between START and END with the other existing templates.
Sample XML (note that I don't know in advance if the parent will be a p element)
<p> my sample text <b>mixed</b> more
<x h="">START:attr="value"</x>
This is mixed content <i>REALLY</i>, process it normally
<x h="">END</x>
</p>
My Stylesheet
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="x[@h][starts-with(., 'START:')]">
    <ph>
      <xsl:for-each-group select="../*" group-starting-with="x[@h][. = 'START:']">
        <xsl:for-each-group select="current-group()" group-ending-with="x[@h][. = 'END']">
          <xsl:apply-templates select="@*|node()|text()"/>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </ph>
  </xsl:template>
  <xsl:template match="x[@h][starts-with(., 'END')]"/>
  <xsl:template match="node()|@*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<p> my sample text <b>mixed</b> more
<ph>mixed</ph>
This is mixed content <i>REALLY</i>, process it normally
</p>
I cannot figure out how to put the complete content between START and END within the <ph> tags. Any ideas?
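For reference, one direction that might work is to do the grouping from the parent of the markers rather than from the START element itself, so that text nodes between the markers are included too. The following is an untested sketch, not a definitive solution. It assumes that every START marker has a matching END marker among the children of the same parent, that marker pairs are not nested directly inside each other, and that START carries exactly one name="value" pair. The variable names spec and end are made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output method="xml"/>

  <!-- Identity template: copies everything not handled by a more specific template. -->
  <xsl:template match="node()|@*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any parent (not only p) that has a START marker among its children. -->
  <xsl:template match="*[x[@h][starts-with(., 'START:')]]">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="@*"/>
      <!-- Group ALL child nodes (text included); a new group begins at each START marker. -->
      <xsl:for-each-group select="node()"
                          group-starting-with="x[@h][starts-with(., 'START:')]">
        <xsl:choose>
          <!-- The context item is the first node of the group. -->
          <xsl:when test="self::x[@h][starts-with(., 'START:')]">
            <!-- Assumption: the marker text is START: followed by one name="value" pair. -->
            <xsl:variable name="spec" select="substring-after(., 'START:')"/>
            <!-- Assumption: a matching END marker exists within this group. -->
            <xsl:variable name="end"
                          select="current-group()[self::x[@h][. = 'END']][1]"/>
            <ph>
              <xsl:attribute name="{substring-before($spec, '=')}"
                             select="translate(substring-after($spec, '='), '&quot;', '')"/>
              <!-- Nodes after START and before END go through the normal templates. -->
              <xsl:apply-templates
                  select="current-group()[position() gt 1][. &lt;&lt; $end]"/>
            </ph>
            <!-- Nodes after END (up to the next START) stay outside ph. -->
            <xsl:apply-templates select="current-group()[. &gt;&gt; $end]"/>
          </xsl:when>
          <xsl:otherwise>
            <!-- Leading nodes before the first START marker. -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the content between the markers is passed to xsl:apply-templates, any other existing templates still apply to it, and an element inside the wrapped range that contains its own START/END pair would be handled by the same template again. The two marker elements are never selected for processing in this sketch, so they need no dedicated templates. If an END marker is missing, the node comparisons against the empty $end select nothing and the content after START would be dropped, which is why the matching-END assumption matters.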