19

How can I find the order of nodes in an XML document?

What I have is a document like this:

<value code="1">
    <value code="11">
        <value code="111"/>
    </value>
    <value code="12">
        <value code="121">
            <value code="1211"/>
            <value code="1212"/>
        </value>
    </value>
</value>

and I'm trying to get this thing into a table defined like

CREATE TABLE values(
    code int,
    parent_code int,
    ord int
)

Preserving the order of the values from the XML document (they can't be ordered by their code). I want to be able to say

SELECT code 
FROM values 
WHERE parent_code = 121 
ORDER BY ord

and the results should, deterministically, be

code
1211
1212

I have tried

SELECT 
    value.value('@code', 'varchar(20)') code, 
    value.value('../@code', 'varchar(20)') parent, 
    value.value('position()', 'int')
FROM @xml.nodes('/root//value') n(value)
ORDER BY code desc

But it doesn't accept the position() function ('position()' can only be used within a predicate or XPath selector).

I guess it's possible some way, but how?

shA.t
  • 16,580
  • 5
  • 54
  • 111
erikkallen
  • 33,800
  • 13
  • 85
  • 120

6 Answers6

42

You can emulate the position() function by counting the number of sibling nodes preceding each node:

SELECT
    code = value.value('@code', 'int'),
    parent_code = value.value('../@code', 'int'),
    ord = value.value('for $i in . return count(../*[. << $i]) + 1', 'int')
FROM @Xml.nodes('//value') AS T(value)

Here is the result set:

code   parent_code  ord
----   -----------  ---
1      NULL         1
11     1            1
111    11           1
12     1            2
121    12           1
1211   121          1
1212   121          2

How it works:

  • The for $i in . clause defines a variable named $i that contains the current node (.). This is basically a hack to work around XQuery's lack of an XSLT-like current() function.
  • The ../* expression selects all siblings (children of the parent) of the current node.
  • The [. << $i] predicate filters the list of siblings to those that precede (<<) the current node ($i).
  • We count() the number of preceding siblings and then add 1 to get the position. That way the first node (which has no preceding siblings) is assigned a position of 1.
Michael Liu
  • 52,147
  • 13
  • 117
  • 150
  • 4
    I used this code on a rather large XML file and because the `for $i in . return count(../*[. << $i]) + 1` part traverses all "sibling" nodes preceding every node this took forever (we let it run at work while going home, it had crashed the next day). So be forewarned that this code has an O(n^2) efficiency. – asontu Feb 03 '15 at 09:42
6

SQL Server's row_number() actually accepts an xml-nodes column to order by. Combined with a recursive CTE you can do this:

declare @Xml xml = 
'<value code="1">
    <value code="11">
        <value code="111"/>
    </value>
    <value code="12">
        <value code="121">
            <value code="1211"/>
            <value code="1212"/>
        </value>
    </value>
</value>'

;with recur as (
    select
        ordr        = row_number() over(order by x.ml),
        parent_code = cast('' as varchar(255)),
        code        = x.ml.value('@code', 'varchar(255)'),
        children    = x.ml.query('./value')
    from @Xml.nodes('value') x(ml)
    union all
    select
        ordr        = row_number() over(order by x.ml),
        parent_code = recur.code,
        code        = x.ml.value('@code', 'varchar(255)'),
        children    = x.ml.query('./value')
    from recur
    cross apply recur.children.nodes('value') x(ml)
)
select *
from recur
where parent_code = '121'
order by ordr

As an aside, you can do this and it'll do what do you expect:

select x.ml.query('.')
from @Xml.nodes('value/value')x(ml)
order by row_number() over (order by x.ml)

Why, if this works, you can't just order by x.ml directly without row_number() over is beyond me.

asontu
  • 4,548
  • 1
  • 21
  • 29
5

You can get the position of the xml returned by a x.nodes() function like so:

row_number() over (order by (select 0))

For example:

DECLARE @x XML
SET @x = '<a><b><c>abc1</c><c>def1</c></b><b><c>abc2</c><c>def2</c></b></a>'

SELECT
    b.query('.'),
    row_number() over (partition by 0 order by (select 0))
FROM
    @x.nodes('/a/b') x(b)
Ben
  • 34,935
  • 6
  • 74
  • 113
3

The answer by erikkallen is absolutely correct.

However, if the original document/schema may be modified, an alternative is to store the position/index in an attribute. I use a mix of both approaches, depending who the "originator" of the XML is and the type of queries that need to be performed upon it. At the end of the day I rue most use of XML except possibly "dumb storage" in SQL Server and am usually happy when I can dump it (XML) for normalized tables.

Happy dealing with the unmentioned limitations of "enterprise-grade" products -- the wonders never end.

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
2

According to this document and this connect entry it is not possible, but the Connect entry contains two workarounds.

I do it like this:

WITH n(i) AS (SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9),
     o(i) AS (SELECT n3.i * 100 + n2.i * 10 + n1.i FROM n n1, n n2, n n3)
SELECT v.value('@code', 'varchar(20)') AS code,
       v.value('../@code', 'varchar(20)') AS parent,
       o.i AS ord
  FROM o
 CROSS APPLY @xml.nodes('/root//value[sql:column("o.i")]') x(v)
 ORDER BY o.i
erikkallen
  • 33,800
  • 13
  • 85
  • 120
  • 1
    Every-time I try to see if there is a good way to perform this I always feel like crying. It's the only way I have found (actually, I use a numbers table, but same ugly hack) -- it's an absolutely pathetic excuse for a server which "supports XML" and makes simple shredding and access much more complicated than it needs to be. –  Jan 12 '11 at 06:07
0

I see answer by @Ben and... get new sollution

 row_number() over (order by (select null))

as

  SELECT value.value('@code', 'varchar(20)') code, 
  value.value('../@code', 'varchar(20)') parent, 
  row_number() over (order by (select null))
  FROM @xml.nodes('/root//value') n(value)
nick_n_a
  • 198
  • 15