Yes, this is another XSLT grouping/duplicates question, but I was unable to find any answers that I could successfully apply to my situation.

This is the original XML:

<data jsxid="jsxroot">
   <record jsxid="10" groupNum="1319" item="q123"  total="1"/>
   <record jsxid="20" groupNum="1319" item="w123"  total="1"/>
   <record jsxid="30" groupNum="1319" item=""  total="0"/>
   <record jsxid="40" groupNum="1322" item="z123" total="1"/>
   <record jsxid="50" groupNum="1322" item="x123" total="1"/>
   <record jsxid="60" groupNum="1322" item="c123" total="1"/>
   <record jsxid="70" groupNum="1322" item="" total="0"/>
   <record jsxid="80" groupNum="1323" item="x123" total="1"/>
   <record jsxid="90" groupNum="1323" item="c123" total="1"/>
   <record jsxid="100" groupNum="1323" item="z123" total="1"/>
   <record jsxid="110" groupNum="1323" item="" total="0"/>
</data>

First, I need the records grouped by the attribute "groupNum", with each group wrapped in a parent record element whose jsxid increments with each group, so that the output looks like this:

<data jsxid="jsxroot">
    <record jsxid="1">
        <record jsxid="10" groupNum="1319" item="q123"  total="1"/>
        <record jsxid="20" groupNum="1319" item="w123"  total="1"/>
        <record jsxid="30" groupNum="1319" item=""  total="0"/>
    </record>
    <record jsxid="2">  
        <record jsxid="40" groupNum="1322" item="z123" total="1"/>
        <record jsxid="50" groupNum="1322" item="x123" total="1"/>
        <record jsxid="60" groupNum="1322" item="c123" total="1"/>
        <record jsxid="70" groupNum="1322" item="" total="0"/>
    </record>
    <record jsxid="3">  
        <record jsxid="80" groupNum="1323" item="x123" total="1"/>
        <record jsxid="90" groupNum="1323" item="c123" total="1"/>
        <record jsxid="100" groupNum="1323" item="z123" total="1"/>
        <record jsxid="110" groupNum="1323" item="" total="0"/>
    </record>
</data>

I was able to accomplish that with this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" />

    <xsl:key name="groupNum" match="data/*" use="@groupNum" />

    <xsl:template match="data">
    <data>
            <xsl:apply-templates select="*[generate-id(.)=generate-id(key('groupNum',@groupNum)[1])]"/>
    </data>
    </xsl:template>


    <xsl:template match="*">
        <record jsxid="{position()}" >
          <xsl:copy-of select="key('groupNum', @groupNum)" />
        </record>
    </xsl:template>

</xsl:stylesheet>

Now I need to remove any grouping that contains exactly the same items as another grouping, comparing only the records where total = "1". If you look at the groupings with jsxid = 2 and jsxid = 3 above, you'll see that they both contain the following item attributes, though not in the same order:

item="x123"
item="c123"
item="z123"

So I want to remove the entire jsxid = 3 grouping and have the final output look like this:

<data jsxid="jsxroot">
    <record jsxid="1">
        <record jsxid="1" groupNum="1319" item="q123"  total="1"/>
        <record jsxid="2" groupNum="1319" item="w123"  total="1"/>
        <record jsxid="3" groupNum="1319" item=""  total="0"/>
    </record>
    <record jsxid="2">  
        <record jsxid="4" groupNum="1322" item="z123" total="1"/>
        <record jsxid="5" groupNum="1322" item="x123" total="1"/>
        <record jsxid="6" groupNum="1322" item="c123" total="1"/>
        <record jsxid="7" groupNum="1322" item="" total="0"/>
    </record>
</data>

EDIT: Instead of removing the duplicate grouping, what if I wanted to add a new attribute as a "grouping identifier"? I'd call it something like "groupType". In the situation above, where the last two groupings contain the same items, both would get the same groupType:

<data jsxid="jsxroot">
    <record jsxid="1">
        <record jsxid="10" groupNum="1319" item="q123"  groupType = "type1" total="1"/>
        <record jsxid="20" groupNum="1319" item="w123"  groupType = "type1" total="1"/>
        <record jsxid="30" groupNum="1319" item=""  groupType = "type1" total="0"/>
    </record>
    <record jsxid="2">  
        <record jsxid="40" groupNum="1322" item="z123" groupType = "type2" total="1"/>
        <record jsxid="50" groupNum="1322" item="x123" groupType = "type2" total="1"/>
        <record jsxid="60" groupNum="1322" item="c123" groupType = "type2" total="1"/>
        <record jsxid="70" groupNum="1322" item="" groupType = "type2" total="0"/>
    </record>
    <record jsxid="3">  
        <record jsxid="80" groupNum="1323" item="x123" groupType = "type2" total="1"/>
        <record jsxid="90" groupNum="1323" item="c123" groupType = "type2" total="1"/>
        <record jsxid="100" groupNum="1323" item="z123" groupType = "type2" total="1"/>
        <record jsxid="110" groupNum="1323" item="" groupType = "type2" total="0"/>
    </record>
</data>

Any help is greatly appreciated.

2 Answers

This is not going to be simple, so hang onto your seat:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:variable name="root" select="/" />

<xsl:key name="rec-by-group" match="record" use="@groupNum" />
<xsl:key name="group-by-items" match="group" use="@items" />

<xsl:template match="/">
    <!-- first pass -->
    <xsl:variable name="groups">
        <!-- for each unique groupNum ... -->
        <xsl:for-each select="data/record[count(. | key('rec-by-group', @groupNum)[1]) = 1]">
            <xsl:variable name="groupNum" select="@groupNum" />
            <!-- ... create a group ... -->
            <group groupNum="{$groupNum}">
                <!-- ... and concatenate all its items - in known order(!) - into a single attribute. -->
                <xsl:attribute name="items">
                    <!-- switch context back to document in order to use key -->
                    <xsl:for-each select="$root">
                        <xsl:for-each select="key('rec-by-group', $groupNum)[@total=1]">
                            <xsl:sort select="@item" data-type="text" order="ascending"/>
                            <xsl:value-of select="@item"/>
                            <xsl:text>|</xsl:text>
                        </xsl:for-each>
                    </xsl:for-each>
                </xsl:attribute>
            </group>
        </xsl:for-each>
    </xsl:variable>

    <!-- final pass -->
    <data jsxid="jsxroot">
        <!-- for each unique group (unique by its @items attribute) ... -->
        <xsl:for-each select="exsl:node-set($groups)/group[count(. | key('group-by-items', @items)[1]) = 1]">
            <!-- ... create a wrapper record ... -->
            <record jsxid="{position()}">
                <xsl:variable name="groupNum" select="@groupNum" />
                <!-- switch context back to document in order to use key -->
                <xsl:for-each select="$root">
                    <!-- ... and list the records in this group. -->
                    <xsl:for-each select="key('rec-by-group', $groupNum)">
                        <record>
                            <xsl:attribute name="jsxid">
                                <xsl:number/>
                            </xsl:attribute>
                            <xsl:copy-of select="@groupNum | @item | @total"/>
                        </record>
                    </xsl:for-each>
                </xsl:for-each>
            </record>
        </xsl:for-each>
    </data>
</xsl:template>

</xsl:stylesheet>
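
For reference, with the sample input from the question, the first pass should leave something like the following in the $groups variable (shown serialized here; it is a result tree fragment, which is why the final pass goes through exsl:node-set()):

```xml
<!-- intermediate result of the first pass: one group per groupNum,
     with the total=1 items sorted and joined by "|" -->
<group groupNum="1319" items="q123|w123|"/>
<group groupNum="1322" items="c123|x123|z123|"/>
<group groupNum="1323" items="c123|x123|z123|"/>
```

Groups 1322 and 1323 end up with the same items string because the items are sorted before being concatenated, so the Muenchian grouping on @items in the final pass keeps only the first of them. All of this assumes your processor supports the EXSLT node-set() function.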

Edit, in response to your edit:

If you can live with groupType being an arbitrary (though unique) number, you can change the "final pass" part to something like:

<!-- final pass -->
<data jsxid="jsxroot">
    <!-- for each group ... -->
    <xsl:for-each select="exsl:node-set($groups)/group">
        <!-- ... create a wrapper record ... -->
        <record jsxid="{position()}">
            <xsl:variable name="groupNum" select="@groupNum" />
            <xsl:variable name="groupType" select="key('group-by-items', @items)[1]/@groupNum" />
            <!-- switch context back to document in order to use key -->
            <xsl:for-each select="$root">
                <!-- ... and list the records in this group. -->
                <xsl:for-each select="key('rec-by-group', $groupNum)">
                    <record groupType="{$groupType}">
                        <xsl:attribute name="jsxid">
                            <xsl:number/>
                        </xsl:attribute>
                        <xsl:copy-of select="@groupNum | @item | @total"/>
                    </record>
                </xsl:for-each>
            </xsl:for-each>
        </record>
    </xsl:for-each>
</data>

Using your input example, this would return:

<?xml version="1.0" encoding="UTF-8"?>
<data jsxid="jsxroot">
   <record jsxid="1">
      <record groupType="1319" jsxid="1" groupNum="1319" item="q123" total="1"/>
      <record groupType="1319" jsxid="2" groupNum="1319" item="w123" total="1"/>
      <record groupType="1319" jsxid="3" groupNum="1319" item="" total="0"/>
   </record>
   <record jsxid="2">
      <record groupType="1322" jsxid="4" groupNum="1322" item="z123" total="1"/>
      <record groupType="1322" jsxid="5" groupNum="1322" item="x123" total="1"/>
      <record groupType="1322" jsxid="6" groupNum="1322" item="c123" total="1"/>
      <record groupType="1322" jsxid="7" groupNum="1322" item="" total="0"/>
   </record>
   <record jsxid="3">
      <record groupType="1322" jsxid="8" groupNum="1323" item="x123" total="1"/>
      <record groupType="1322" jsxid="9" groupNum="1323" item="c123" total="1"/>
      <record groupType="1322" jsxid="10" groupNum="1323" item="z123" total="1"/>
      <record groupType="1322" jsxid="11" groupNum="1323" item="" total="0"/>
   </record>
</data>

As you can see, it uses the groupNum of the first group of the current type as the groupType value. Alternatively, you could also use the auto-generated id of that group:

<xsl:variable name="groupType" select="generate-id(key('group-by-items', @items)[1])" />
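
If you would rather have the sequential type1, type2, ... labels from the question, one possibility (a sketch only, not part of the stylesheet as posted; the variable name first is made up for illustration) is to number the first group of each type by counting the distinct groups that precede it in $groups:

```xml
<!-- sketch: replaces the groupType variable in the final pass above -->
<!-- the first group that shares this group's @items string -->
<xsl:variable name="first" select="key('group-by-items', @items)[1]" />
<!-- 1 + the number of distinct (first-of-their-kind) groups before it -->
<xsl:variable name="groupType"
    select="concat('type', 1 + count($first/preceding-sibling::group[count(. | key('group-by-items', @items)[1]) = 1]))" />
```

For the sample input this should label groups 1319, 1322 and 1323 as type1, type2 and type2 respectively, since 1323 resolves to 1322 as the first group of its type.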
michael.hor257k
  • Brilliant. Works great. Now I just need to stare at it for the rest of the day to try to understand it. – Gabe Altenhofen Oct 08 '14 at 18:17
  • @GabeAltenhofen I have added some comments - HTH. – michael.hor257k Oct 08 '14 at 18:30
  • I edited my original question and I'm wondering if the extra requirement is a "simple" fix. These multi-pass solutions are more than I can get my head around. – Gabe Altenhofen Oct 14 '14 at 23:11
  • @GabeAltenhofen I don't think it's a simple fix. You still need to reduce the number of groups to 2, so that you have a source for the `groupType` values. Outputting the original 3 groups is trivial (just remove the Muenchian grouping in the final pass) - but you need each group to relate itself to one of the groups in the reduced list. IOW, you need to add yet another pass. – michael.hor257k Oct 14 '14 at 23:28
  • @GabeAltenhofen On second thought, this *could* be a simple fix, if ... (see the edit to my answer). – michael.hor257k Oct 15 '14 at 00:26
  • Once again, this is spot on and I appreciate the extra comments in the code. – Gabe Altenhofen Oct 15 '14 at 16:06

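To copy only the first instance, enclose each item in a condition when generating it, using count(preceding-sibling::*[@item=$item]) = 0. Note that item is an attribute in your input, so it has to be addressed as @item.

<xsl:template match="record">
    <!-- the item attribute of the current record -->
    <xsl:variable name="item" select="@item"/>
    <record jsxid="{position()}" >
        <!-- copy the group only if no earlier sibling has the same item -->
        <xsl:if test="count(preceding-sibling::*[@item=$item])=0">
           <xsl:copy-of select="key('groupNum', @groupNum)" />
        </xsl:if>
    </record>
</xsl:template>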
Mike