2

Hope, someone out there can help me with SQL Query, I am trying to merge two XML column, I am currently using SQL Azure 2019.

First XML

<HOME>
    <VALIDITYLIST>
        <VALIDITY STATE="1">
            <VALIDITYTYPE>1</VALIDITYTYPE>
            <GROUPCODE>DEFAULT</GROUPCODE>
            <ENTRY/>
            <CARD>2</CAR>
            <GIFTAID/>
            <VARIABLERANGE>false</VARIABLERANGE>
            <DAYS>365</DAYS>
            <NOTOPERATING>false</NOTOPERATING>
            <VALIDITYLIST/>
            <YPERESTRICTIONLIST/>
            <METRALOCKERV2>
                <LOCKERITEMID/>
            </METRALOCKERV2>
            <REQUIREDVAREXPDATE/>
        </VALIDITY>
    </VALIDITYLIST>
</HOME>

Second XML

<HOME>
    <VALIDITYLIST>
        <VALIDITY STATE="1">
            <VALIDITYTYPE>1</VALIDITYTYPE>
            <GROUPCODE>DEFAULT</GROUPCODE>
            <GIFTAID/>
            <DYNAMICP/>
            <VALIDITYLIST>
                <VALIDITY STATE="1">
                    <VALIDITYTYPE>2</VALIDITYTYPE>
                    <EVENT>3</EVENT>
                    <ENTRYTYPE>2</ENTRYTYPE>
                    <NUMENTRY>1</NUMENTRY>
                </VALIDITY>
            </VALIDITYLIST>
        </VALIDITY>
    </VALIDITYLIST>
</HOME>

Somehow SQL merge two xmls and return only nodes where values are present. something like.

<HOME>
    <VALIDITYLIST>
        <VALIDITY>
            <VALIDITYTYPE>1</VALIDITYTYPE>
            <GROUPCODE>DEFAULT</GROUPCODE>
            <CARD>2</CAR>
            <VARIABLERANGE>false</VARIABLERANGE>
            <DAYS>365</DAYS>
            <NOTOPERATING>false</NOTOPERATING>
            <VALIDITYLIST>
                <VALIDITY>
                    <VALIDITYTYPE>2</VALIDITYTYPE>
                    <EVENT>3</EVENT>
                    <ENTRYTYPE>2</ENTRYTYPE>
                    <NUMENTRY>1</NUMENTRY>
                </VALIDITY>
            </VALIDITYLIST>
        </VALIDITY>
    </VALIDITYLIST>
</HOME> 

Thanks guys for sharing your thought [edited part below]

I still would like to handle within SQL. Since Nodes are fixed, i am thinking of creating two tables by reading individual XML's and then depending on the values, i'll construct new XML, now i am stuck on first part, reading the XML. This is what i come up and i should expect to see value of 1 and 2 when reading 2nd XML but returning NULL's, can you see what i am doing wrong here?

select @XML2.value('(/HOME/VALIDITYTYPE/node())[1]', 'nvarchar(max)') as VALIDITYTYPE
, @XML2.value('(/HOME/VALIDITYLIST/VALIDITYLIST/VALIDITYTYPE/node())[1]', 'nvarchar(max)') as VALIDITYTYPE
  • And what happens if a specific XML node is present in **both** XML documents, and has different values? – marc_s Aug 26 '23 at 15:53
  • Thanks marc_s, it would never happens in my case but if it does, i guess First XML will take priority over the second. – Declan Junior Aug 26 '23 at 16:28
  • 2
    Maybe someone will submit an answer that proves me wrong but probably going to be much easier to do this outside SQL Server in something like C# – Martin Smith Aug 26 '23 at 16:35
  • Thanks martin, This has to be carried out within SQL though, a schedule jobs will update the data on as soon as few data arrives in the table. – Declan Junior Aug 26 '23 at 16:37
  • -Martin Smith, Managed Instance in Azure. can you please confirm who i can use CLR? – Declan Junior Aug 26 '23 at 16:50
  • Yeah managed instance supports CLR so potentially you could use CLR. Maybe something from here https://stackoverflow.com/q/6045010/73226 - and if the XSLT transform approach works for you see https://stackoverflow.com/q/4943347/73226 – Martin Smith Aug 26 '23 at 16:54
  • @MartinSmith Fairly simple if you only have one node. – Charlieface Aug 27 '23 at 01:28

2 Answers2

3

Yes, this is possible in pure SQL, using XQuery and full-joining the two XMLs, then reconstructing it using FOR XML

SELECT *,
  NewXml = (
    SELECT
      VALIDITYLIST = (
        SELECT
          ISNULL(x1.query('.'), x2.query('.'))
        FROM
          t.Xml1.nodes('HOME/VALIDITYLIST/VALIDITY/*[.//text()]') x1(x1)
        FULL JOIN
          t.Xml2.nodes('HOME/VALIDITYLIST/VALIDITY/*[.//text()]') x2(x2)
          ON x2.x2.value('local-name(.)','nvarchar(30)') = x1.x1.value('local-name(.)','nvarchar(30)')
        FOR XML PATH(''), ROOT('VALIDITY'), TYPE
      )
    FOR XML PATH(''), ROOT('HOME'), TYPE
  )
FROM YourTable t;

The predicate *[.//text()] checks that the current node has any descendant text element at any depth.

We join by local-name(.) the name of the current node, and take the first column's result if available, otherwise the second. FOR XML does not add a node name if the column is unnamed.

This all assumes you always have exactly one HOME/VALIDITYLIST/VALIDITY node, otherwise it's more complicated.

db<>fiddle

Charlieface
  • 52,284
  • 6
  • 19
  • 43
2

One more way by using XQuery FLWOR expression.

It simulates a LEFT OUTER JOIN by using col1 XML column as a base.

It needs to be tested to cover all edge cases.

SQL

DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, col1 XML, col2 XML);
INSERT @tbl (col1, col2) VALUES
(N'<HOME>
    <VALIDITYLIST>
        <VALIDITY STATE="1">
            <VALIDITYTYPE>1</VALIDITYTYPE>
            <GROUPCODE>DEFAULT</GROUPCODE>
            <ENTRY/>
            <CARD>2</CARD>
            <GIFTAID/>
            <VARIABLERANGE>false</VARIABLERANGE>
            <DAYS>365</DAYS>
            <NOTOPERATING>false</NOTOPERATING>
            <VALIDITYLIST/>
            <YPERESTRICTIONLIST/>
            <METRALOCKERV2>
                <LOCKERITEMID/>
            </METRALOCKERV2>
            <REQUIREDVAREXPDATE/>
        </VALIDITY>
    </VALIDITYLIST>
</HOME>',
N'<HOME>
    <VALIDITYLIST>
        <VALIDITY STATE="1">
            <VALIDITYTYPE>1</VALIDITYTYPE>
            <GROUPCODE>DEFAULT</GROUPCODE>
            <GIFTAID/>
            <DYNAMICP/>
            <VALIDITYLIST>
                <VALIDITY STATE="1">
                    <VALIDITYTYPE>2</VALIDITYTYPE>
                    <EVENT>3</EVENT>
                    <ENTRYTYPE>2</ENTRYTYPE>
                    <NUMENTRY>1</NUMENTRY>
                </VALIDITY>
            </VALIDITYLIST>
        </VALIDITY>
    </VALIDITYLIST>
</HOME>');

SELECT t.ID, x
    , x.query('<HOME><VALIDITYLIST><VALIDITY>
    {
        for $x in /root/HOME[1]/VALIDITYLIST/VALIDITY/*
        let $y := /root/HOME[2]/VALIDITYLIST/VALIDITY/*[local-name(.)=local-name($x)]
        return if ($x[.//text()]) then $x
            else $y
    }
    </VALIDITY></VALIDITYLIST></HOME>') AS result
FROM @tbl AS t
    CROSS APPLY (SELECT TRY_CAST('<root>' + 
        TRY_CAST(col1 AS NVARCHAR(MAX)) +
        TRY_CAST(col2 AS NVARCHAR(MAX)) +
    '</root>' AS XML)) AS t1(x);

Output

<HOME>
  <VALIDITYLIST>
    <VALIDITY>
      <VALIDITYTYPE>1</VALIDITYTYPE>
      <GROUPCODE>DEFAULT</GROUPCODE>
      <CARD>2</CARD>
      <GIFTAID />
      <VARIABLERANGE>false</VARIABLERANGE>
      <DAYS>365</DAYS>
      <NOTOPERATING>false</NOTOPERATING>
      <VALIDITYLIST>
        <VALIDITY STATE="1">
          <VALIDITYTYPE>2</VALIDITYTYPE>
          <EVENT>3</EVENT>
          <ENTRYTYPE>2</ENTRYTYPE>
          <NUMENTRY>1</NUMENTRY>
        </VALIDITY>
      </VALIDITYLIST>
    </VALIDITY>
  </VALIDITYLIST>
</HOME>
Yitzhak Khabinsky
  • 18,471
  • 2
  • 15
  • 21
  • Slightly different version https://dbfiddle.uk/XQPWCYyh do note that you will miss any nodes which are not present at all in `col1` – Charlieface Aug 27 '23 at 11:54
  • @Charlieface, thanks for the suggestion. I adjusted the answer accordingly. Overall, I was after `LEFT OUTER JOIN` simulation. – Yitzhak Khabinsky Aug 27 '23 at 13:30