In SQL Server 2005, is there a way to convert the following XML into a table?

<root>
  <r>
    <data>"col1"</data>
    <data>"col2"</data>
    <data>"col3"</data>
  </r>
  <r>
    <data>"data1"</data>
    <data>""</data>
    <data>"data3"</data>
  </r>
  <r>
    <data>"data"</data>
    <data>"data"</data>
    <data>"data"</data>
  </r>
</root>

I want the output to be

col1 col2 col3
----------------
data1      data3
data data data

The XML can have a different number of columns, so the solution needs to account for this.

Thanks in advance.

SausageFingers

3 Answers

declare @xml xml
set @xml = 
'<root>
  <r>
    <data>"col1"</data>
    <data>"col2"</data>
    <data>"col3"</data>
  </r>
  <r>
    <data>"data1"</data>
    <data>""</data>
    <data>"data3"</data>
  </r>
  <r>
    <data>"data"</data>
    <data>"data"</data>
    <data>"data"</data>
  </r>
</root>'

declare @SQL nvarchar(max)
set @SQL = ''

select @SQL = @SQL + ',replace(r.r.value(''data['+
         cast(T.rn as nvarchar(10))+
         ']'', ''varchar(10)''), ''"'','''') as '+
         quotename(replace(T.ColName, '"', '')) 
from
(
  select
    r.r.value('.', 'sysname') as ColName,
    row_number() over(order by (select 1)) as rn
  from @xml.nodes('/root/r[1]/data') r(r)
) as T

set @SQL = 'select '+stuff(@SQL, 1, 1, '')+
        ' from @x.nodes(''/root/r[position()>1]'') r(r)'

exec sp_executesql @SQL, N'@x xml', @x = @xml

Since I use dynamic SQL here, it is appropriate to suggest reading Erland Sommarskog's article The Curse and Blessings of Dynamic SQL.

An explanation of what is going on.

This query is used to get the column names from the first r node:

select
    r.r.value('.', 'sysname') as ColName,
    row_number() over(order by (select 1)) as rn
  from @xml.nodes('/root/r[1]/data') r(r)

/root/r[1] makes sure we only get the first row. row_number() numbers the columns, which connects each column name to its position.

The resulting query in @SQL is this:

select 
  replace(r.r.value('data[1]', 'varchar(10)'), '"','') as [col1],
  replace(r.r.value('data[2]', 'varchar(10)'), '"','') as [col2],
  replace(r.r.value('data[3]', 'varchar(10)'), '"','') as [col3] 
from @x.nodes('/root/r[position()>1]') r(r)

/root/r[position()>1] gets all r nodes except the first one. The 1 in data[1] comes from row_number(), and [col1] comes from the corresponding column name.

quotename() adds the brackets [] to the column alias. Without quotename(), a crafted column name in the XML could be used for SQL injection.

replace() is used to remove " from the string. It removes all occurrences of ", so if you expect " to be part of a value, you could use substring() to strip only the first and last character instead.
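
The substring() alternative could look something like the sketch below. It is only an illustration: @v is a hypothetical variable standing in for the result of r.r.value(...), and the case expression is an added guard (not part of the query above) so that values without surrounding quotes pass through unchanged.

```sql
-- Sketch only: strip one leading and one trailing double quote,
-- leaving any quotes inside the value untouched.
-- @v is a hypothetical stand-in for r.r.value('data[1]', 'varchar(100)').
declare @v varchar(100)
set @v = '"say "hi""'

select case
         -- only strip when the value really is wrapped in quotes
         when len(@v) >= 2 and left(@v, 1) = '"' and right(@v, 1) = '"'
           then substring(@v, 2, len(@v) - 2)
         else @v
       end as Stripped
```

With replace() the inner quotes around hi would be removed as well; here only the outer pair goes. To use this in the dynamic query, the expression would have to be generated once per column in place of the replace() call.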

I have used varchar(10) as the data type for the column data. You should change that to whatever size you need.
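
To see why the size matters, here is a minimal sketch with a hypothetical sample value. As far as I know, value() does not raise an error when the text is longer than the target type; it cuts the string off, much like cast() does. Note also that the surrounding quotes count toward the length, because replace() runs after value().

```sql
-- Hypothetical sample: a value longer than 10 characters (including the quotes).
declare @x xml
set @x = '<r><data>"a rather long value"</data></r>'

select
  -- expected to come back cut to the first 10 characters
  @x.value('(/r/data)[1]', 'varchar(10)')  as ShortValue,
  -- large enough to hold the whole string
  @x.value('(/r/data)[1]', 'varchar(100)') as LongValue
```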

Mikael Eriksson

Not with a varying number of columns: a SQL query in general returns a fixed set of columns.

However, you can anticipate this somewhat by selecting up to a fixed maximum number of columns:

-- SQL Server 2005 does not allow initializing a variable in DECLARE, so SET is used
DECLARE @foo AS xml
SET @foo = '<root>
  <r>
    <data>"col1"</data>
    <data>"col2"</data>
    <data>"col3"</data>
  </r>
  <r>
    <data>"data1"</data>
    <data>""</data>
    <data>"data3"</data>
  </r>
  <r>
    <data>"data"</data>
    <data>"data"</data>
    <data>"data"</data>
  </r>
</root>'

SELECT
   REPLACE(x.item.value('(data)[1]', 'varchar(100)'), '"', '') AS col1,
   REPLACE(x.item.value('(data)[2]', 'varchar(100)'), '"', '') AS col2,
   REPLACE(x.item.value('(data)[3]', 'varchar(100)'), '"', '') AS col3,
   REPLACE(x.item.value('(data)[4]', 'varchar(100)'), '"', '') AS col4,
   REPLACE(x.item.value('(data)[5]', 'varchar(100)'), '"', '') AS col5,
   REPLACE(x.item.value('(data)[6]', 'varchar(100)'), '"', '') AS col6,
   REPLACE(x.item.value('(data)[7]', 'varchar(100)'), '"', '') AS col7,
   REPLACE(x.item.value('(data)[8]', 'varchar(100)'), '"', '') AS col8,
   REPLACE(x.item.value('(data)[9]', 'varchar(100)'), '"', '') AS col9,
   REPLACE(x.item.value('(data)[10]', 'varchar(100)'), '"', '') AS col10
FROM
   @foo.nodes('/root/r') x(item)
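
The reason a fixed list of ten columns is safe is that value() returns NULL when the XPath expression matches no node, so columns beyond those present in the XML simply come back as NULL. A minimal sketch, using a hypothetical sample with a single data element:

```sql
-- Hypothetical sample with only one data element.
declare @x xml
set @x = '<r><data>"a"</data></r>'

select
  replace(@x.value('(/r/data)[1]', 'varchar(100)'), '"', '') as col1, -- a
  replace(@x.value('(/r/data)[2]', 'varchar(100)'), '"', '') as col2  -- NULL: there is no second data node
```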

There is also no guarantee of the order in which nodes are evaluated by default, which complicates extracting the column names from the first row.

Based on this answer: http://stackoverflow.com/q/1134075/27535, you can use this SQL to identify row 1:

;WITH n(i) AS (SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9),
     o(i) AS (SELECT n3.i * 100 + n2.i * 10 + n1.i FROM n n1, n n2, n n3)
SELECT
   REPLACE(x.item.value('(data)[1]', 'varchar(100)'), '"', '') AS col1,
   REPLACE(x.item.value('(data)[2]', 'varchar(100)'), '"', '') AS col2,
   REPLACE(x.item.value('(data)[3]', 'varchar(100)'), '"', '') AS col3,
   REPLACE(x.item.value('(data)[4]', 'varchar(100)'), '"', '') AS col4,
   REPLACE(x.item.value('(data)[5]', 'varchar(100)'), '"', '') AS col5,
   REPLACE(x.item.value('(data)[6]', 'varchar(100)'), '"', '') AS col6,
   REPLACE(x.item.value('(data)[7]', 'varchar(100)'), '"', '') AS col7,
   REPLACE(x.item.value('(data)[8]', 'varchar(100)'), '"', '') AS col8,
   REPLACE(x.item.value('(data)[9]', 'varchar(100)'), '"', '') AS col9,
   REPLACE(x.item.value('(data)[10]', 'varchar(100)'), '"', '') AS col10,
   o.i
FROM
   o
   CROSS APPLY
   @foo.nodes('/root/r[sql:column("o.i")]') x(item)
gbn

I've never done it, but I know it can be done...

Take a look at this tutorial; it seems to be pretty close to exactly what you're asking for: http://weblogs.sqlteam.com/mladenp/archive/2007/06/18/60235.aspx

AllenG
Thanks @AllenG, but my data has already been imported into a table as nvarchar(max). From what I gather, BULK is best for importing large XML files from the file system. – SausageFingers May 20 '11 at 10:47