xsl:sort: sorting by numeric value

Question

I have to sort the codes in numerical order. Each code has four letters followed by four digits.

For example:

COMP2100
COMP2400
COMP3410
LAWS2202
LAWS2250

When I just do <xsl:sort select="code" order="ascending" />, it displays the result above.

However, I want the codes to be in numerical order, that is:

COMP2100
LAWS2202
LAWS2250
COMP2400
COMP3410

How do I do this?

score 11 · Accepted Answer · edited May 23 '17 at 12:33

Note: the OP has now provided sample XML. The approaches below can be trivially adapted to that XML.

I. XSLT 1.0 (part 1)

Here is a simple solution that assumes your assertion ("the codes have four characters and four numerals") will always be the case. When this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
  <xsl:output omit-xml-declaration="no" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <t>
      <xsl:apply-templates>
        <xsl:sort select="substring(., 5)"
          data-type="number" />
      </xsl:apply-templates>
    </t>
  </xsl:template>
</xsl:stylesheet>

...is applied to an imagined XML document, shuffled into random order:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP3410</i>
  <i>LAWS2202</i>
  <i>COMP2400</i>
  <i>COMP2100</i>
  <i>LAWS2250</i>
</t>

...the correct result is produced:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP2100</i>
  <i>LAWS2202</i>
  <i>LAWS2250</i>
  <i>COMP2400</i>
  <i>COMP3410</i>
</t>

Explanation:

The Identity Transform -- one of the (if not the) most fundamental design patterns in XSLT -- copies all nodes from the source XML document to the result XML document as-is.
One template overrides the Identity Transform by sorting all children of <t> based upon the characters in the string from position 5 to the string's end.

Again, note that this solution assumes your original assertion -- "the codes have four characters and four numerals" -- is (and always will be) true.
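
If two codes ever share the same number (say COMP2100 and LAWS2100), the numeric key alone leaves them in document order. One possible refinement, which is an assumption and not part of the original answer, is a second sort key on the four-letter prefix. A minimal sketch of the body of the /* template under that assumption:

```xml
<xsl:apply-templates>
  <!-- Primary key: characters from position 5 onward, compared as a number -->
  <xsl:sort select="substring(., 5)" data-type="number" />
  <!-- Secondary key (assumption, not in the original answer):
       the first four characters, compared as text, to order ties -->
  <xsl:sort select="substring(., 1, 4)" />
</xsl:apply-templates>
```

Multiple xsl:sort children are applied in order of significance, so the prefix only matters when the numeric parts are equal.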

II. XSLT 1.0 (part 2)

A (potentially) safer solution would be to assume that there might be numerous non-numeric characters in various positions within the <i> nodes. In that case, this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
  <xsl:output omit-xml-declaration="no" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:variable name="vNums" select="'1234567890'" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/*">
    <t>
      <xsl:apply-templates>
        <xsl:sort select="translate(., translate(., $vNums, ''), '')"
          data-type="number" />
      </xsl:apply-templates>
    </t>
  </xsl:template>
</xsl:stylesheet>

...provides the same result:

<?xml version="1.0" encoding="utf-8"?>
<t>
  <i>COMP2100</i>
  <i>LAWS2202</i>
  <i>LAWS2250</i>
  <i>COMP2400</i>
  <i>COMP3410</i>
</t>

Explanation:

The Identity Transform is once again used.
In this case, the additional template uses the so-called Double Translate Method (first proposed by Michael Kay and first shown to me by Dimitre Novatchev) to remove all non-numeric characters from the value of each <i> element before sorting.
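
To see why the nested translate() calls work, it may help to run the two steps separately. The sketch below uses the literal 'COMP2100' as a stand-in for the value of one <i> element; the variable names vCode, vJunk and vDigits are made up for illustration:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:variable name="vNums" select="'1234567890'" />

  <xsl:template match="/">
    <!-- Literal stand-in for the string value of one <i> element -->
    <xsl:variable name="vCode" select="'COMP2100'" />

    <!-- Step 1: delete every digit, leaving only the unwanted characters -->
    <xsl:variable name="vJunk" select="translate($vCode, $vNums, '')" />

    <!-- Step 2: delete those unwanted characters from the original string -->
    <xsl:variable name="vDigits" select="translate($vCode, $vJunk, '')" />

    <xsl:value-of select="concat($vJunk, ' / ', $vDigits)" />
  </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, this prints COMP / 2100: the inner call yields the characters to discard, and the outer call discards them, leaving only the digits as the sort key.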

III. XSLT 2.0 Solution

A possible XSLT 2.0 solution is very similar to part 2 of the XSLT 1.0 solution; it merely replaces the Double Translate Method with XPath 2.0's ability to handle regular expressions:

<xsl:sort select="replace(., '[^\d]', '')" data-type="number" />

Note that you are by no means required to use regular expressions in XPath 2.0; the Double Translate Method works just as well there as it does in XPath 1.0. The replace() function will, however, most likely be more efficient.
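
For context, here is one way the regular-expression key could sit in a complete XSLT 2.0 stylesheet. This is a sketch against the imagined <t>/<i> document above, not the original author's code; it uses \D (any non-digit) and an explicit number() call instead of data-type="number", which is a choice made here for illustration:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="/*">
    <xsl:copy>
      <xsl:for-each select="*">
        <!-- Strip every non-digit, then convert what is left to a number
             so that the comparison is numeric rather than textual -->
        <xsl:sort select="number(replace(., '\D', ''))" />
        <xsl:copy-of select="." />
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

An element containing no digits at all produces NaN as its key, so it is worth checking how such values should be ordered before relying on this with real data.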

Hey ABach thanks for the information! I have provided my XML and XSL can you please tell me which one would work best in terms of efficiency here? thank you! — Jane Doe, Sep 16 '12 at 20:07
@JaneDoe - no problem! Given that you are using XSLT 1.0, it is likely that XSLT 1.0 Solution #1 will be most efficient. However, unless you are dealing with very large documents, XSLT 1.0 Solution #1 and XSLT 1.0 Solution #2 will be very similar. My recommendation is to pick based on what your future needs will be; again, XSLT 1.0 Solution #2 is a bit more foolproof. — ABach, Sep 16 '12 at 20:09
@ABach, Your XSLT 2.0 solution has obvious error -- please, always verify your solutions by actually running them on representative data and confirming they produce the correct result. Also, the XSLT 1.0 solution isn't applicable to the specific XML document structure provided by the OP. — Dimitre Novatchev, Sep 16 '12 at 20:35
What about for something like this: http://stackoverflow.com/questions/33372683/how-to-sort-entries-of-xml-using-xslt?noredirect=1#comment54539949_33372683 (It is part of a smartform so I cannot have multiple same field value) — Si8, Oct 27 '15 at 16:04

Dimitre Novatchev · Answer 2 · 2012-10-26T13:36:52.793

There are two obvious errors in the provided XSLT code:

The namespace used to select elements is different from the default namespace of the provided XML document. Just change the stylesheet's xmlns:xsi declaration (xmlns:xsi="file://Volumes/xxxxxxx/Assignment") so that its URI is identical to the document's default namespace URI.
The sort at present is not numeric. Change:

<xsl:sort select="xsi:code" order="ascending" />

to:

   <xsl:sort select="substring(xsi:code, 5)" data-type="number" />

The complete transformation becomes:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:fn="http://www.w3.org/2005/xpath-functions"
 xmlns:xsi="file://Volumes/u4783938/Assignment">
<xsl:template match="/">
    <html>
    <head>
        <title> Course Catalogue </title>
    </head>
    <body bgcolor="#FF9999">
        <h1> <div style="text-align:center"> Course Catalogue </div> </h1>
        <xsl:for-each select="xsi:catalogue/xsi:course">
        <xsl:sort select="substring(xsi:code, 5)"
         data-type="number" />
        <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">
            <xsl:apply-templates select="xsi:code" />
            <br />
            <xsl:apply-templates select="xsi:title" />
            <br />
            <xsl:apply-templates select="xsi:year" />
            <br />
            <xsl:apply-templates select="xsi:science" />
            <br />
            <xsl:apply-templates select="xsi:area" />
            <br />
            <xsl:apply-templates select="xsi:subject" />
            <br />
            <xsl:apply-templates select="xsi:updated" />
            <br />
            <xsl:apply-templates select="xsi:unit" />
            <br />
            <xsl:apply-templates select="xsi:description" />
            <br />
            <xsl:apply-templates select="xsi:outcomes" />
            <br />
            <xsl:apply-templates select="xsi:incompatibility" />
        </div>
        </xsl:for-each>
    </body>
    </html>
</xsl:template>
</xsl:stylesheet>

and when it is applied to this XML document:

<catalogue xmlns="file://Volumes/u4783938/Assignment"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="file://Volumes/u4443554/Assignment/courses.xsd">
    <course>
        <code>ABCD3410</code>
        <title> Information Technology in Electronic Commerce </title>
        <year>later year</year>
        <science>C</science>
        <area> Research School of Computer Science </area>
        <subject> Computer Science </subject>
        <updated>2012-03-13T13:12:00</updated>
        <unit>6</unit>
        <description>Tce </description>
        <outcomes>Up trCommerce. </outcomes>
        <incompatibility>COMP1100</incompatibility>
    </course>
    <course>
        <code>COMP2011</code>
        <title> Course 2011 </title>
        <year>Year 2011</year>
        <science>C++</science>
        <area> Research School of Computer Science </area>
        <subject> Computer Science </subject>
        <updated>2012-03-13T13:12:00</updated>
        <unit>6</unit>
        <description>Tce </description>
        <outcomes>Up trCommerce. </outcomes>
        <incompatibility>COMP1100</incompatibility>
    </course>
</catalogue>

the produced result is now correctly sorted by the numeric part of the course code:

<html xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:xsi="file://Volumes/u4783938/Assignment">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <title> Course Catalogue </title>
   </head>
   <body bgcolor="#FF9999">
      <h1>
         <div style="text-align:center"> Course Catalogue </div>
      </h1>
      <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">COMP2011<br> Course 2011 <br>Year 2011<br>C++<br> Research School of Computer Science <br> Computer Science <br>2012-03-13T13:12:00<br>6<br>Tce <br>Up trCommerce. <br>COMP1100
      </div>
      <div style="width:1000px;margin-bottom:4px;color:white;background-color:#F36;text-align:justify;border:outset;margin-left:auto;margin-right:auto;">ABCD3410<br> Information Technology in Electronic Commerce <br>later year<br>C<br> Research School of Computer Science <br> Computer Science <br>2012-03-13T13:12:00<br>6<br>Tce <br>Up trCommerce. <br>COMP1100
      </div>
   </body>
</html>
