2

I had an xls book saved as xml. There I have 1st rows with header and other rows with data. In xml it looks:

<Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">Header 1 - id</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="String">Header 2 - Version</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="String">Header 3 - some data...</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>
</Row>
<Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">id001</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="Number">1</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="String">blabla</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>        
</Row>
<Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">id001</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="Number">2</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="String">blabla</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>        
</Row>
<Row ss:AutoFitHeight="0">
    <Cell><Data ss:Type="String">id002</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="Number">1</Data><NamedCell ss:Name="_FilterDatabase"/></Cell>
    <Cell><Data ss:Type="String">blabla</Data><NamedCell
      ss:Name="_FilterDatabase"/></Cell>        
</Row>

So I have two main fields - Header 1 and Header 2, first contains id (Not unique!!!) the second - version or edition.
How is it looks in Excel:

Number  Version
A001    1
A001    6
A002    2
A002    3
A003    1

I need to process all unique id's with the maximum version! Here I would like to get

A001    6
A002    3
A003    1

My xslt code is:

 <xsl:template match="orig:Table" name ="Table">
    <xsl:variable name="Id" select ="'Number'"/>
    <xsl:element name="Declarations">
      <xsl:for-each select="orig:Row">
        <xsl:sort select="orig:Cell[1]/orig:Data" data-type="text" order="ascending"/>
        <xsl:sort select="orig:Cell[2]/orig:Data" data-type="number" order="descending"/>
        <xsl:if test="$Id!=orig:Cell[1]/orig:Data">
          <xsl:call-template name="Row"> <!--Here in template "Row" all further processing will be done-->
        </xsl:if>

      </xsl:for-each>
    </xsl:element>
  </xsl:template>

At first I'm sorting them to get the list like:

Number  Version
A001    6
A001    1
A002    3
A002    2
A003    1

Then I want to save every 1st header value and compare it to itself in the next row - if the value is the same, that means that it is the same record with older version and we skip it. But if value changes - this mean that we have a new record with maximum version and we should take it. Eg:

"Number" != A001    6 => we take this
A001 = A001 1 => skip
A001 != A002     3 => we take this
and so on

It would be easy to do if I could use inside "IF" construction like

variable name="Id" := orig:Cell[1]/orig:Data

But it is impossible here!

Please Help!

  • Can you use XSLT 2.0 or are you limited to 1.0 only? – Ian Roberts Jan 27 '15 at 13:20
  • And in the example you've given the original XML seems to be ordered from smallest to largest version number within each ID, is that guaranteed or might there be cases where the versions are listed out of order? – Ian Roberts Jan 27 '15 at 13:22
  • No, only XSLT 1.x. In my example data in the 1st header is sorted from smallest to largest but I assume that in other case it could be out of order. But for now any decision would be good, even for variant when they are from smallest to highest. – Trinadsatiy Jan 27 '15 at 13:43

1 Answers1

0

This is basically a grouping problem - you are trying to group rows by their first cell value, then produce one output row per group from the latest version within that group. The standard approach in XSLT 1.0 is called Muenchian grouping and involves defining a key that groups related nodes and a then using trick with generate-id to pull out just the first node in each group:

 <xsl:key name="rowById" match="orig:Row" use="orig:Cell[1]/orig:Data" />

 <xsl:template match="orig:Table" name ="Table">
    <xsl:variable name="Id" select ="'Number'"/>
    <xsl:element name="Declarations">
      <!-- Muenchian grouping - one "iteration" per unique idNNN value -->
      <xsl:for-each select="orig:Row[
           generate-id() = generate-id(key('rowById', orig:Cell[1]/orig:Data)[1])]">
        <!-- sort groups by ID -->
        <xsl:sort select="orig:Cell[1]/orig:Data" data-type="text" order="ascending"/>

        <!-- for-each over the members of this group -->
        <xsl:for-each select="key('rowById', orig:Cell[1]/orig:Data)">
          <!-- find the maximum version value within this group -->
          <xsl:sort select="orig:Cell[2]/orig:Data" data-type="number" order="descending"/>
          <xsl:if test="position() = 1">
            <xsl:call-template name="Row"> <!--Here in template "Row" all further processing will be done-->
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
Ian Roberts
  • 120,891
  • 16
  • 170
  • 183