I'm trying to group adjacent rows together based on the running sum of an attribute. Example:

<root>
    <row id="AAA" val="2"/>
    <row id="BBB" val="3"/>
    <row id="CCC" val="1"/>
    <row id="DDD" val="4"/>
    <row id="EEE" val="6"/>
    <row id="FFF" val="3"/>
    <row id="GGG" val="6"/>
    <row id="HHH" val="8"/>
    <row id="III" val="3"/>
    <row id="JJJ" val="4"/>
    <row id="KKK" val="2"/>
    <row id="LLL" val="1"/>
</root>

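Let's say I have a parameter of 10. Rows should then be added to the current group for as long as their values sum to 10 or less; a row that would push the sum above 10 starts a new group. For example, AAA through DDD sum to exactly 10, and adding EEE (6) would exceed it, so EEE starts the second group. The result should be: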

<root>
    <grouped>
        <row id="AAA" val="2"/>
        <row id="BBB" val="3"/>
        <row id="CCC" val="1"/>
        <row id="DDD" val="4"/>
    </grouped>
    <grouped>
        <row id="EEE" val="6"/>
        <row id="FFF" val="3"/>
    </grouped>
    <grouped>
        <row id="GGG" val="6"/>
    </grouped>
    <grouped>
        <row id="HHH" val="8"/>
    </grouped>
    <grouped>
        <row id="III" val="3"/>
        <row id="JJJ" val="4"/>
        <row id="KKK" val="2"/>
        <row id="LLL" val="1"/>
    </grouped>
</root>

I tried group-adjacent with sum(current/@val + following-sibling::row/@val le 10), then tried group-by with sum(@val), but I can see my basic approach is incorrect. Now I'm wondering whether this is even possible, so I thought I'd ask the experts!

Thanks!

Adrian

2 Answers

In XSLT 1 you could use sibling recursion; in XSLT 3 it is easier, if a bit verbose, to use xsl:iterate:

  <xsl:template match="root">
      <xsl:copy>
          <xsl:iterate select="row">
              <xsl:param name="sum" as="xs:integer" select="0"/>
              <xsl:param name="group" as="element(row)*" select="()"/>
              <xsl:on-completion>
                  <xsl:if test="$group">
                      <grouped>
                          <xsl:copy-of select="$group"/>
                      </grouped>
                  </xsl:if>
              </xsl:on-completion>
              <xsl:variable name="current-sum" select="$sum + xs:integer(@val)"/>
              <xsl:if test="$current-sum > 10">
                  <grouped>
                      <xsl:copy-of select="$group"/>
                  </grouped>
              </xsl:if>
              <xsl:next-iteration>
                  <xsl:with-param name="sum" select="if ($current-sum > 10) then xs:integer(@val) else $current-sum"/>
                  <xsl:with-param name="group" select="if ($current-sum > 10) then . else ($group, .)"/>
              </xsl:next-iteration>
          </xsl:iterate>
      </xsl:copy>
  </xsl:template>

https://xsltfiddle.liberty-development.net/6pS2B6o
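
For XSLT 1.0, the sibling recursion mentioned above might look roughly like the following. This is only a sketch, not tested against any particular processor: the mode names start, fill and skip are made up for illustration, and the max parameter is assumed to play the same role as the threshold of 10 in the question. Each group is opened at one row and filled by walking the following siblings while the running sum stays within the limit; a second walk with the same sums finds the row where the next group begins.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Threshold for the sum of @val within one group (assumed name). -->
  <xsl:param name="max" select="10"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <!-- The first row always opens the first group. -->
      <xsl:apply-templates select="row[1]" mode="start"/>
    </xsl:copy>
  </xsl:template>

  <!-- Open a group at this row, fill it, then look for the next group's first row. -->
  <xsl:template match="row" mode="start">
    <grouped>
      <xsl:copy-of select="."/>
      <xsl:apply-templates select="following-sibling::row[1]" mode="fill">
        <xsl:with-param name="sum" select="number(@val)"/>
      </xsl:apply-templates>
    </grouped>
    <xsl:apply-templates select="following-sibling::row[1]" mode="skip">
      <xsl:with-param name="sum" select="number(@val)"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Copy this row into the open group if it still fits, then try the next sibling. -->
  <xsl:template match="row" mode="fill">
    <xsl:param name="sum"/>
    <xsl:if test="$sum + @val &lt;= $max">
      <xsl:copy-of select="."/>
      <xsl:apply-templates select="following-sibling::row[1]" mode="fill">
        <xsl:with-param name="sum" select="$sum + @val"/>
      </xsl:apply-templates>
    </xsl:if>
  </xsl:template>

  <!-- Walk past the rows already copied; the first row that does not fit starts a new group. -->
  <xsl:template match="row" mode="skip">
    <xsl:param name="sum"/>
    <xsl:choose>
      <xsl:when test="$sum + @val &lt;= $max">
        <xsl:apply-templates select="following-sibling::row[1]" mode="skip">
          <xsl:with-param name="sum" select="$sum + @val"/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="start"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Each row is visited twice (once to fill, once to skip), and the recursion depth grows with the number of rows, so a very long input may run into stack limits on some XSLT 1.0 processors.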

As an alternative, you could use an accumulator that sums the @val values and "remembers" when a "group" has been established, then in the grouping you can use group-starting-with to check the accumulator:

  <xsl:param name="max" as="xs:integer" select="10"/>

  <xsl:mode on-no-match="shallow-copy" use-accumulators="#all"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:accumulator name="window" as="item()*" initial-value="()">
      <xsl:accumulator-rule match="root" select="(0, true())"/>
      <xsl:accumulator-rule match="root/row"
        select="let $val := xs:integer(@val),
                    $sum := $value[1],
                    $window-start := $value[2],
                    $current-sum := $sum + $val
                return
                    if ($current-sum gt $max)
                    then ($val, true())
                    else ($current-sum, false())"/>
  </xsl:accumulator>

  <xsl:template match="root">
      <xsl:copy>
          <xsl:for-each-group select="row" group-starting-with="*[accumulator-before('window')[2]]">
              <grouped>
                  <xsl:apply-templates select="current-group()"/>
              </grouped>
          </xsl:for-each-group>
      </xsl:copy>
  </xsl:template>

https://xsltfiddle.liberty-development.net/6pS2B6o/1

You can even make that streamable (with the help of Michael Kay):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="3.0">

    <xsl:param name="max" as="xs:integer" select="10"/>

    <xsl:mode on-no-match="shallow-copy" use-accumulators="#all" streamable="yes"/>

    <xsl:output method="xml" indent="yes"/>

    <xsl:accumulator name="window" as="item()*" initial-value="()" streamable="yes">
        <xsl:accumulator-rule match="root" select="(0, true())"/>
        <xsl:accumulator-rule match="root/row"
            select="
                let $val := xs:integer(@val),
                    $sum := $value[1],
                    $window-start := $value[2],
                    $current-sum := $sum + $val
                return
                    if ($current-sum gt $max)
                    then
                        ($val, true())
                    else
                        ($current-sum, false())"
        />
    </xsl:accumulator>

    <xsl:template match="root">
        <xsl:copy>
            <xsl:for-each-group select="row"
                group-starting-with="*[boolean(accumulator-before('window')[2])]">
                <grouped>
                    <xsl:apply-templates select="current-group()"/>
                </grouped>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
Martin Honnen
  • Thank you so much! I'm using the first solution because I understand what's happening. I still need to figure out how the other solutions work. I just realized I don't know XSLT :( – Adrian Jun 17 '20 at 08:00

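The xsl:for-each-group instruction can't handle this requirement on its own: the grouping condition depends on a running total, which Martin's second solution supplies through an accumulator.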

As well as Martin's suggestions, another 3.0 approach is fold-left:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:array="http://www.w3.org/2005/xpath-functions/array"
  exclude-result-prefixes="#all"
  version="3.0">

  <xsl:param name="max" as="xs:integer" select="10"/>

  <xsl:template match="root">
    <xsl:copy>
      <xsl:variable name="groups" select="
        fold-left(row, ([]), function($groups, $next) {
           if (sum(head($groups)?*/@val) + $next/@val le $max)
           then (array:append(head($groups), $next), tail($groups))
           else ([$next], $groups)
        }) => reverse()"/>
      <xsl:for-each select="$groups">
        <grouped>
          <xsl:copy-of select="?*"/>
        </grouped>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

This builds the groups as a sequence of arrays, one array per group, initially in reverse order: the callback function is executed once for each row, and it adds the row to the first (i.e. last) group if the total is within your threshold, otherwise it starts a new group.

(Why in reverse order? Largely because head() and tail() are convenient, and there's no equivalent for getting the last item and "all except the last").
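
For comparison, the same fold can keep the groups in document order by using positional predicates in place of head() and tail(). This variant is a sketch and not part of the original answer. It assumes the same max parameter and starts from an empty sequence rather than an empty array, so the first row always opens a group.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:array="http://www.w3.org/2005/xpath-functions/array"
  exclude-result-prefixes="#all"
  version="3.0">

  <xsl:param name="max" as="xs:integer" select="10"/>

  <xsl:template match="root">
    <xsl:copy>
      <!-- The group being filled is always the last array in $groups. -->
      <xsl:variable name="groups" select="
        fold-left(row, (), function($groups, $next) {
           let $last := $groups[last()]
           return
             if (exists($last) and sum($last?*/@val) + $next/@val le $max)
             then ($groups[position() lt last()], array:append($last, $next))
             else ($groups, [$next])
        })"/>
      <xsl:for-each select="$groups">
        <grouped>
          <xsl:copy-of select="?*"/>
        </grouped>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

No reverse() is needed here, at the cost of the slightly wordier $groups[last()] and $groups[position() lt last()].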

Michael Kay