0

I have an XML string:

<XML>
 <xml_line>
   <col1>1</col1>
   <col2>foo 1</col2>
 </xml_line>
 <xml_line>
   <col1>2</col1>
   <col2>foo 2</col2>
 </xml_line>    
</XML>

I am extracting data from that string (stored in @data_xml) by storing it in SQL Server table and parsing it:

-- create temp table, insert XML string 
CREATE TABLE table1 (data_xml XML)

INSERT table1 
    SELECT @data_xml

-- parse XML string into temp table
SELECT
    N.C.value('col1[1]', 'int') col1_name,
    N.C.value('col2[1]', 'varchar(31)') col2_name,
FROM 
    table1
CROSS APPLY 
    data_xml.nodes('//xml_line') N(C)

I would like to know if there is a generic way to accomplish the same without specifying column names (i.e. col1[1], col2[1])

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
goryef
  • 1,337
  • 3
  • 22
  • 37
  • 1
    SQL Server is declarative by design. Dynamic columns would require Dynamic SQL – John Cappelletti Feb 03 '23 at 16:39
  • 1
    Just to add to my comment, you can hash XML and create a hierarchy. This is probably more than what you are looking for, but it can be trimmed down. https://dbfiddle.uk/sE36tDoK – John Cappelletti Feb 03 '23 at 16:51
  • 1
    @JohnCappelletti Thanks. I saw approach here https://stackoverflow.com/questions/61755806/select-all-xml-nodes-from-xml-column. That i can work with. Seems similar to yours. But was hoping for a simpler solution. – goryef Feb 03 '23 at 16:53
  • 1
    Shnugo is the king of XML (honestly huge fan). That approach is a scaled down version of my function. You may also note that it does not support attributes or complex structures. Also I need to point out that both solutions, the columns are not dynamic. – John Cappelletti Feb 03 '23 at 17:03

1 Answers1

1

You can use something like:

SELECT
    N.C.value('let $i := . return count(//xml_line[. << $i]) + 1', 'int') as LineNumber,
    Item.Node.value('local-name(.)', 'varchar(max)') name,
    Item.Node.value('.', 'varchar(max)') value
FROM 
    table1
CROSS APPLY 
    data_xml.nodes('//xml_line') N(C)
CROSS APPLY 
    N.C.nodes('*') Item(Node)

To get:

LineNumber name value
1 col1 1
1 col2 foo 1
2 col1 2
2 col2 foo 2

See this db<>fiddle.

However, to spread columns horizontally, you will need to generate dynamic SQL after querying for distinct element names.

ADDENDUM: Here is an updated db<>fiddle that also shows a dynamic SQL example.

The above maps all values as VARCHAR(MAX). If you have NVARCHAR data you can make the appropriate changes. If you have a need to map specific columns to specific types, you will need to explicitly define and populate a name-to-type mapping table and incorporate that into the dynamic SQL logic. The same may be necessary if you prefer that the result columns be in a specific order.

ADDENDUM 2: This updated db<>fiddle now includes column type and ordering logic.

--------------------------------------------------
-- Extract column names
--------------------------------------------------

DECLARE @Names TABLE (name VARCHAR(100))

INSERT @Names   
SELECT DISTINCT Item.Node.value('local-name(.)', 'varchar(max)')
FROM table1
CROSS APPLY data_xml.nodes('//xml_line/*') Item(Node)

--SELECT * FROM @Names

--------------------------------------------------
-- Define column-to-type mapping
--------------------------------------------------

DECLARE @ColumnTypeMap TABLE ( ColumnName SYSNAME, ColumnType SYSNAME, ColumnOrder INT)
INSERT @ColumnTypeMap
VALUES
    ('col1', 'int', 1),
    ('col2', 'varchar(10)', 2)

DECLARE @ColumnTypeDefault SYSNAME = 'varchar(max)'

--------------------------------------------------
-- Define SQL Templates
--------------------------------------------------

DECLARE @SelectItemTemplate VARCHAR(MAX) =
'    , N.C.value(<colpath>, <coltype>) <colname>
'

DECLARE @SqlTemplate VARCHAR(MAX) =
'SELECT
    N.C.value(''let $i := . return count(//xml_line[. << $i]) + 1'', ''int'') as LineNumber
<SelectItems>
FROM 
    table1
CROSS APPLY 
    data_xml.nodes(''//xml_line'') N(C)
'

--------------------------------------------------
-- Expand SQL templates into SQL
--------------------------------------------------

DECLARE @SelectItems VARCHAR(MAX) = (
    SELECT STRING_AGG(SI.SelectItem, '')
              WITHIN GROUP(ORDER BY ISNULL(T.ColumnOrder, 999), N.Name)
    FROM @Names N
    LEFT JOIN @ColumnTypeMap T ON T.ColumnName = N.name
    CROSS APPLY (
        SELECT SelectItem = REPLACE(REPLACE(REPLACE(
            @SelectItemTemplate
            , '<colpath>', QUOTENAME(N.name + '[1]', ''''))
            , '<colname>', QUOTENAME(N.name))
            , '<coltype>', QUOTENAME(ISNULL(T.ColumnType, @ColumnTypeDefault), ''''))
    ) SI(SelectItem)
)

DECLARE @Sql VARCHAR(MAX) = REPLACE(@SqlTemplate, '<SelectItems>', @SelectItems)

--------------------------------------------------
-- Execute
--------------------------------------------------

SELECT DynamicSql = @Sql

EXEC (@Sql)

Result (with some additional data):

LineNumber col1 col2 bar foo
1 1 foo 1 null More
2 2 foo 2 Stuff null
T N
  • 4,322
  • 1
  • 5
  • 18
  • I've updated the above to include a fiddle that includes a dynamic SQL example. – T N Feb 03 '23 at 18:23