For what it's worth, here is a way to do this in XSLT 1.0.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />
  <!-- Index every named element by its own @name plus the @name of its parent and grandparent -->
  <xsl:key name="name" match="*[@name]" use="
    concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name)
  " />
  <!-- Identity template: copies everything that has no more specific rule -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
  </xsl:template>
  <!-- Named elements: output only the first of each group, then process the children of the whole group -->
  <xsl:template match="*[@name]">
    <xsl:variable name="myKey" select="
      concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name)
    " />
    <xsl:variable name="myGroup" select="key('name', $myKey)" />
    <xsl:if test="generate-id() = generate-id($myGroup[1])">
      <xsl:copy>
        <xsl:copy-of select="@*" />
        <xsl:apply-templates select="$myGroup/*" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
outputs
<roots>
  <root name="name1">
    <layer1 name="name2">
      <layer2 attribute="sowhat"/>
      <layer2 attribute="justit"/>
      <layer2 attribute="yeaha"/>
    </layer1>
  </root>
  <root name="name2123">
    <layer1 name="name2">
      <layer2 attribute="itis"/>
    </layer1>
  </root>
</roots>
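For reference, an input along the following lines would produce that output. This is a hypothetical reconstruction made up for illustration, not the original input from the question; only the element and attribute names are taken from the output.

```xml
<!-- Hypothetical input: three <root name="name1"> elements scattered among
     their siblings, each contributing one <layer2> to the merged result -->
<roots>
  <root name="name1">
    <layer1 name="name2">
      <layer2 attribute="sowhat"/>
    </layer1>
  </root>
  <root name="name1">
    <layer1 name="name2">
      <layer2 attribute="justit"/>
    </layer1>
  </root>
  <root name="name2123">
    <layer1 name="name2">
      <layer2 attribute="itis"/>
    </layer1>
  </root>
  <root name="name1">
    <layer1 name="name2">
      <layer2 attribute="yeaha"/>
    </layer1>
  </root>
</roots>
```

Note that the `<layer1 name="name2">` under `name2123` is not merged with the others: its parent has a different `@name`, so its key differs.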
A key strength of XSLT is the ability to express complex transformations in relatively few lines of code. The transformation above is fewer than 30 lines, and you could condense it even further.
A crash course in XSLT is beyond the scope of this answer, and there are countless ones available all over the Internet.
So instead I'll give a general overview of what happens here.
First off, I've defined two classes of elements for your input: those that are mergeable and those that are not. I've defined all elements that have a @name attribute to be mergeable.
- All normal nodes (those without a @name) are copied as they are. The first <xsl:template> does that (it's the identity template).
- I've defined a "mergeable group" of elements as those that share a common set of @name attribute values along their ancestors.
- To do that, I create the concatenation of all relevant @name attributes for all elements that have them.
- For the time being, this transformation can handle groups that go 3 levels deep (concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name)).
- Add more levels in the same fashion if necessary.
- The group name (the key) for the parent of sowhat is name2|name1| (the last segment is empty because <roots> has no @name). The same key applies to the parents of the other <layer2> elements in that logical group.
- Now, whenever the XSLT engine encounters an element with a @name, it
  - calculates the key for that element ($myKey).
  - gets the group of elements that have the same key ($myGroup).
  - finds out if the current element is the first element in the group; if so, it copies it to the output.
    - Effectively, this groups elements by their key (this technique is called Muenchian grouping).
  - then, for that first element, it takes a recursive step: it starts processing the children of the whole group ($myGroup/*).
    - Effectively, this takes us back to square one and the algorithm starts from the beginning.
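As a sketch of what "add more levels in the same fashion" could look like, here is the same stylesheet with the key extended to four levels (the element itself plus three ancestors). Nothing here is new apart from the extra `ancestor::*[3]/@name` segment; the important detail is that the `use` expression of the key and the `$myKey` variable must be changed together, otherwise the lookup no longer finds the group.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />
  <!-- Same key as before, plus a fourth segment for the great-grandparent -->
  <xsl:key name="name" match="*[@name]" use="
    concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name, '|', ancestor::*[3]/@name)
  " />
  <!-- Identity template, unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[@name]">
    <!-- Must be the exact same expression as in the key definition above -->
    <xsl:variable name="myKey" select="
      concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name, '|', ancestor::*[3]/@name)
    " />
    <xsl:variable name="myGroup" select="key('name', $myKey)" />
    <xsl:if test="generate-id() = generate-id($myGroup[1])">
      <xsl:copy>
        <xsl:copy-of select="@*" />
        <xsl:apply-templates select="$myGroup/*" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

An ancestor that does not exist, or that has no @name, simply contributes an empty string, so the deeper key still works for shallower elements.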
There are some assumptions/limitations in my code that might not necessarily align with your input.
- The elements ought to be merged by their @name and not by some other property.
- The elements with the same @name ancestry do not have special attributes, so throwing away every element but the first one in a certain group will not cause loss of data.
- There is a finite nesting depth.
- Mergeable elements are never the descendants of non-mergeable elements (no <layer> with a @name inside a <layer> without a @name).
- Probably others that slip my mind right now.
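If the second assumption does not hold for your data, one possible variation (not part of the solution above, just a sketch) is to copy the attributes of every element in the group instead of only those of the first one. The only change is the `select` of the `<xsl:copy-of>`:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />
  <xsl:key name="name" match="*[@name]" use="
    concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name)
  " />
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[@name]">
    <xsl:variable name="myKey" select="
      concat(@name, '|', ancestor::*[1]/@name, '|', ancestor::*[2]/@name)
    " />
    <xsl:variable name="myGroup" select="key('name', $myKey)" />
    <xsl:if test="generate-id() = generate-id($myGroup[1])">
      <xsl:copy>
        <!-- Variation: take the attributes of all group members, not just the first -->
        <xsl:copy-of select="$myGroup/@*" />
        <xsl:apply-templates select="$myGroup/*" />
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

This only avoids losing attributes that appear on some group members and not on others. When two members carry the same attribute with different values, XSLT 1.0 lets the attribute added later replace the earlier one, so such conflicts are still resolved silently rather than reported.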
Reading recommendations
- template matching and the general working mechanisms of an XSLT processor
- XSL default rules
- XPath
- XSL keys and Muenchian grouping
- the identity template
- the concept of the current node throughout the processing flow