
I have XML in which sibling nodes have identical attribute values but different contents. This occurs at both the parent and the child level, as follows:

<myxml>
    <a myattr="valuetop1">
        <b myattr="valuechild1">
            <c>Stuff1</c>
            <c>Stuff2</c>
        </b>
    </a>
    <a myattr="valuetop1">
        <b myattr="valuechild2">
            <c>Stuff3</c>
        </b>
    </a>
    <a myattr="valuetop1">
        <b myattr="valuechild2">
            <c>Stuff4</c>
        </b>
    </a>
    <a myattr="valuetop1">
        <b myattr="valuechild2">
            <c>Stuff5</c>
            <c>Stuff6</c>
        </b>
    </a>
    <a myattr="valuetop2">
        <b myattr="valuechild1">
            <c>Stuff1</c>
        </b>
    </a>
    <a myattr="valuetop2">
        <b myattr="valuechild3">
            <c>Stuff2</c>
        </b>
    </a>
    <a myattr="valuetop2">
        <b myattr="valuechild2">
            <c>Stuff3</c>
            <c>Stuff2</c>
        </b>
    </a>
    <a myattr="valuetop2">
        <b myattr="valuechild2">
            <c>Stuff4</c>
        </b>
    </a>
</myxml>

If there are nodes with identical attribute values that exist at the same level, I want to combine their contents under a single instance of that node. In other words, I'm looking for a neat hierarchy like this:

<myxml>
    <a myattr="valuetop1">
        <b myattr="valuechild1">
            <c>Stuff1</c>
            <c>Stuff2</c>
        </b>
        <b myattr="valuechild2">
            <c>Stuff3</c>
            <c>Stuff4</c>
            <c>Stuff5</c>
            <c>Stuff6</c>
        </b>
    </a>        
    <a myattr="valuetop2">
        <b myattr="valuechild1">
            <c>Stuff1</c>
        </b>
        <b myattr="valuechild3">
            <c>Stuff2</c>
        </b>
        <b myattr="valuechild2">
            <c>Stuff3</c>
            <c>Stuff2</c>
            <c>Stuff4</c>
        </b>
    </a>    
</myxml>

The catch is that I don't know what the values of valuetopx or valuechildx will be. I've been banging my head over this one for a couple of days, but can't get my brain around it.

Squidx3
  • [**Why is "can't get my brain around it" not an actual question?**](https://meta.stackoverflow.com/q/284236/290085) – kjhughes Nov 19 '17 at 20:14
  • What you actually have here is a grouping problem. If you are using XSLT 1.0, read up on Muenchian Grouping (see http://www.jenitennison.com/xslt/grouping/muenchian.html). If you are using XSLT 2.0, read up on xsl:for-each-group. – Tim C Nov 19 '17 at 20:20
  • @TimC Looks like a decent answer. Would you like to make one? I would appreciate getting this out of the list of unanswered questions. – Yunnosch Nov 19 '17 at 20:42

1 Answer


As mentioned in the comments, you can use a technique called Muenchian Grouping in XSLT 1.0, but in your case you need to apply it at two levels.

First, for the parent, define a key on the attribute value:

<xsl:key name="parent" match="a" use="@myattr" />

Then, for the child, the key must combine the parent's attribute value with the child's own, because the same child value can occur under different parents and must then form a separate group. The '|' separator keeps the two values distinct within the combined key:

<xsl:key name="child" match="b" use="concat(../@myattr, '|', @myattr)" />

Then, to select one a element per distinct parent value, keep only the first node in each key group:

<xsl:apply-templates select="a[generate-id() = generate-id(key('parent', @myattr)[1])]" />

Within each distinct parent, select the distinct child elements the same way, this time using the combined key:

 <xsl:apply-templates select="key('parent', @myattr)/b
                              [generate-id() = generate-id(key('child', concat(../@myattr, '|', @myattr))[1])]" />

Putting it all together, try this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes" />

  <xsl:key name="parent" match="a" use="@myattr" />
  <xsl:key name="child" match="b" use="concat(../@myattr, '|', @myattr)" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="myxml">
    <xsl:copy>
      <xsl:apply-templates select="a[generate-id() = generate-id(key('parent', @myattr)[1])]" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="a">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="key('parent', @myattr)/b[generate-id() = generate-id(key('child', concat(../@myattr, '|', @myattr))[1])]" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="b">
    <xsl:copy>
      <xsl:apply-templates select="@*" />
      <xsl:apply-templates select="key('child', concat(../@myattr, '|', @myattr))/c" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
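
The comments also mention xsl:for-each-group for XSLT 2.0. As a rough sketch only, assuming an XSLT 2.0 processor is available (the question does not say which version is in use), the same two-level grouping might be written without keys like this. It rebuilds the a and b elements with only their myattr attribute, which is an assumption based on the sample input:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="myxml">
    <xsl:copy>
      <!-- One group per distinct a/@myattr, in order of first appearance -->
      <xsl:for-each-group select="a" group-by="@myattr">
        <a myattr="{current-grouping-key()}">
          <!-- Within this parent group, one group per distinct b/@myattr -->
          <xsl:for-each-group select="current-group()/b" group-by="@myattr">
            <b myattr="{current-grouping-key()}">
              <!-- Copy every c from all b elements in this child group -->
              <xsl:copy-of select="current-group()/c" />
            </b>
          </xsl:for-each-group>
        </a>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the inner xsl:for-each-group only sees the b elements of the current parent group, no combined parent-and-child key is needed here.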
Tim C