Replace multiple values in XML column based on a mapping table

Question

Suppose I have a mapping table called tblMap that just maps an old attribute ID to a new attribute ID (oldID -> newID). NOTEWORTHY: newID is not contained in the list of oldID's.

I then have a table tblData which contains an xml string that has a number of attribute ids. I would like to replace all the current attribute ids with the newIDs that are found in tblMap. If an id mapping is not found in tblMap then it should stay as is. Any hints on how I can achieve this?

WHAT I TRIED:

I was trying to coerce something using XMLText.modify('replace value of ...') as described in: This StackOverflow Article but haven't been successful at getting it to work.

CREATE TABLE tblmap (
  oldid INT, 
  newid INT
)
GO

INSERT INTO tblMap
VALUES
( 58, 1002),
( 85, 5002),
( 70, 3202),
(2, 2340),
(5, 7432)
GO

CREATE TABLE tblData ( [SourceID] int, [SourceRecID] bigint, [Value] xml )
GO

INSERT INTO tblData
VALUES
( 1, 0, N'<attributes><attribute id="58" value="0" /><attribute id="86" value="1" /><attribute id="85" value="1" /><attribute id="70" value="0" /><attribute id="38" value="0" /><attribute id="68" value="0" /><attribute id="42" value="1" /><attribute id="67" value="1" /><attribute id="62" value="1" /></attributes>' ), 
( 1, 686, N'<attributes><attribute id="1" value="0.25" /><attribute id="4" value="1" /><attribute id="10" value="3" /><attribute id="11" value="1" /><attribute id="12" value="6" /></attributes>' ), 
( 1, 687, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="9" value="1" /><attribute id="10" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' ), 
( 1, 688, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' )


SELECT *
FROM tblMap
GO

SELECT *
FROM tblData
GO

I have constructed all the schema/sample data here for your convenience: https://rextester.com/MUMI61854

@Denis Now you have 2008 or 2012, I think I update your question according to your comment where you say 2008 no? — Ilyes, Dec 05 '18 at 19:08
@Sami please see above. I have 2012 but 2008 solutions will work just fine too. — Denis, Dec 05 '18 at 19:09

score 2 · Accepted Answer · edited Dec 06 '18 at 15:26

I would try to completely recreate the entire XML (or rather the /attributes node) and update the table with new value:

declare @tblmap table (oldid INT, newid INT);

INSERT INTO @tblMap
VALUES
( 58, 1002),
( 85, 5002),
( 70, 3202),
(2, 2340),
(5, 7432);

declare @tblData table ([SourceID] int, [SourceRecID] bigint, [Value] xml);

INSERT INTO @tblData
VALUES
( 1, 0, N'<attributes><attribute id="58" value="0" /><attribute id="86" value="1" /><attribute id="85" value="1" /><attribute id="70" value="0" /><attribute id="38" value="0" /><attribute id="68" value="0" /><attribute id="42" value="1" /><attribute id="67" value="1" /><attribute id="62" value="1" /></attributes>' ), 
( 1, 686, N'<attributes><attribute id="1" value="0.25" /><attribute id="4" value="1" /><attribute id="10" value="3" /><attribute id="11" value="1" /><attribute id="12" value="6" /></attributes>' ), 
( 1, 687, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="9" value="1" /><attribute id="10" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' ), 
( 1, 688, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' );

SELECT * FROM @tblMap;
SELECT * FROM @tblData;

-- Update table with new XML
with cte as (
select d.*, (
    select isnull(m.newid, a.c.value('./@id', 'int')) as [@id], a.c.value('./@value', 'nvarchar(max)') as [@value]
    from d.Value.nodes('/attributes[1]/attribute') a(c)
        left join @tblmap m on m.oldid = a.c.value('./@id', 'int')
    for xml path('attribute'), type, root('attributes')
    ) as [NewValue]
from @tblData d
)
update c set Value = NewValue
from cte c;

-- New version
select * from @tblData;

(I have turned your tables into table variables, as it leaves zero footprint on the instance. Everything else is the same.)

Unfortunately, this approach can become prohibitively difficult to implement if your actual XML schema is more complex than your example shows, and involves additional unpredictable elements and / or attributes under the /attributes node. In that case, I would recommend either a FLWOR (which is slow and quite difficult to write, at least for me) or cursored update.

To DEBUG:

-- Update table with new XML
with cte as (
select d.*, (
    select isnull(m.newid, a.c.value('./@id', 'int')) as [@id], a.c.value('./@value', 'nvarchar(max)') as [@value]
    from d.Value.nodes('/attributes[1]/attribute') a(c)
        left join @tblmap m on m.oldid = a.c.value('./@id', 'int')
    for xml path('attribute'), type, root('attributes')
    ) as [NewValue]
from @tblData d
)
SELECT c.SourceID,
   c.SourceRecID,
   c.Value,
   c.NewValue
from cte c;

Great solution (+1 from my side)! Does exactly the same as mine on SQL-Server-level, while mine is processing the whole thing in `XQuery`. No time for this at the moment, but I'd be interested, which approach is faster... Have to test this later... — Shnugo, Dec 06 '18 at 09:34
I like this too. My XML is just what I posted, no different - thank goodness. Will give it a go. The only issue I see (just looking at this query) is that if the XML is something like `` then this will return "NULL" for the NewValue which my table will reject since Value is NOT NULL (realized that in my example it wasn't setup that way) but I can pre-process because there is no point to having those empty attributes stored anyway so I can delete them before this query. Great solution by the way! Very easy to debug as you can see the old value and new value side-by-side! — Denis, Dec 06 '18 at 15:23
@Denis, `null` results are very easy to circumvent - just replace the update line with `update c set Value = isnull(NewValue, N'')` and it should be enough (can't test right now). — Roger Wolf, Dec 06 '18 at 23:43

score 2 · Answer 2 · answered Dec 06 '18 at 09:32

My suggestion calls XQuery to the rescue (txh Roger Wolf for the declared table variables, used them too...):

declare @tblmap table (oldid INT, newid INT);

INSERT INTO @tblMap
VALUES
( 58, 1002),
( 85, 5002),
( 70, 3202),
(2, 2340),
(5, 7432);

declare @tblData table ([SourceID] int, [SourceRecID] bigint, [Value] xml);

INSERT INTO @tblData
VALUES
( 1, 0, N'<attributes><attribute id="58" value="0" /><attribute id="86" value="1" /><attribute id="85" value="1" /><attribute id="70" value="0" /><attribute id="38" value="0" /><attribute id="68" value="0" /><attribute id="42" value="1" /><attribute id="67" value="1" /><attribute id="62" value="1" /></attributes>' ), 
( 1, 686, N'<attributes><attribute id="1" value="0.25" /><attribute id="4" value="1" /><attribute id="10" value="3" /><attribute id="11" value="1" /><attribute id="12" value="6" /></attributes>' ), 
( 1, 687, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="9" value="1" /><attribute id="10" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' ), 
( 1, 688, N'<attributes><attribute id="1" value="2.00" /><attribute id="2" value="60.00" /><attribute id="3" value="-1" /><attribute id="5" value="252.00" /><attribute id="6" value="0" /><attribute id="7" value="1" /><attribute id="11" value="2" /><attribute id="12" value="10" /></attributes>' );

--The query will do the whole process in one single pass

WITH CombineThem AS
(
    SELECT d.SourceID
          ,d.SourceRecID
          ,d.[Value]
          ,(SELECT
               (SELECT * 
                FROM @tblMap 
                FOR XML PATH('map'),ROOT('maps'),TYPE)
              ,[Value] AS [*]
             FOR XML PATH('Combined'),TYPE) AS Combined
    FROM @tblData d
)
,updateableCTE AS
(
    SELECT ct.[Value]
          ,ct.Combined
           .query('<attributes>
                   {
                    for $attr in /Combined/attributes/attribute
                    return <attribute id="{
                                           (
                                            /Combined/maps/map[oldid[1]=$attr/@id]/newid
                                            ,$attr/@id
                                           )[1]
                                          }" 
                                      value="{$attr/@value}"/> 
                   }  
                   </attributes>') NewValue
    FROM CombineThem ct
)
UPDATE updateableCTE SET [Value]=NewValue;

--check the result

SELECT * FROM @tblData;

Some Explanation

In order to use the mapping and the data in XQuery I create a combined XML in the first CTE. This will include the full <attributes> element together with a <maps> element.

The .query() will run through the attributes and search the <maps> for a fitting re-mapping. The magic happens in (val1,val2)[1]. This behaves like COALESCE(). It will pick the first non-null-value, which is either the fitting new id or the existing value.

Instead of updating the XML using .modify() the final step will set the [Value] column to the newly created XML in one go.

Denis, FYI the approach used by @Shnugo is called "FLWOR" (pronounced "flavour"), I mentioned it in my answer. You can do (almost) anything with it, but the learning curve is steep indeed. Here is some description: https://learn.microsoft.com/en-us/sql/xquery/flwor-statement-and-iteration-xquery — Roger Wolf, Dec 06 '18 at 09:43

score 1 · Answer 3 · answered Dec 05 '18 at 19:30

1

Honestly, I'm not 100% the reliability on ORDER BY (SELECT NULL) here, however, I don't have many options other than to hope the order is that of the nodes.

Anyway, the solution involves dynamic SQL; there may be a "better" way of doing this, but if there is I don't know it. I suggest doing some decent testing first, however, this appears to get the results you're after:

DECLARE @SQL nvarchar(MAX);
SET @SQL = STUFF((SELECT NCHAR(10) +
                         N'UPDATE tblData' + NCHAR(10) + 
                         N'SET [Value].modify(''replace value of (/attributes/attribute/@id)[' + CONVERT(varchar(4),ROW_NUMBER() OVER (PARTITION BY D.SourceID, D.SourceRecID ORDER BY (SELECT NULL))) + N'] with "' + CONVERT(varchar(4),ISNULL(M.newid,V.AA.value('@id','int'))) + N'"'')' + NCHAR(10) +
                         N'WHERE SourceID = ' + CONVERT(varchar(4),D.SourceID) + NCHAR(10) +
                         N'  AND SourceRecID = ' + CONVERT(varchar(4),D.SourceRecID) + N';'
                  FROM tblData D
                       CROSS APPLY D.[Value].nodes('attributes/attribute') V(AA)
                       LEFT JOIN tblmap M ON V.AA.value('@id','int') = M.oldid
                  FOR XML PATH(N'')),1,1,N'');

EXEC sp_executesql @SQL;

answered Dec 05 '18 at 19:30

Thom A

88,727
11
45
75

I was thinking of some sort of a WHILE loop but I'll try to work on this one to see if it works. – Denis Dec 05 '18 at 19:37
I personally prefer to build the dynamic sql, over a `WHILE` loop. There are certainly some cases where I've found dynamic SQL, to build a large batch, far faster than a `WHILE`/`CURSOR`. – Thom A Dec 05 '18 at 19:41
I wonder if it would be better to use `replace value of (/attributes/attribute/@id)[.=58][1] with "1002)"` (equality comparison to old value in xml) – Denis Dec 05 '18 at 19:59
`@Age` Denis? You don't have an `@Age` element in your data. – Thom A Dec 05 '18 at 20:00
sorry I was looking at an idea in another SO article and copied/pasted too fast – Denis Dec 05 '18 at 20:01
Oh, in the link you posted before. The problem there, @Denis, is that you aren't replacing a literal value with another literal value; you're replacing a lookup value with another lookup value. The solution here relies on `ROW_NUMBER` getting the right order; if it doesn't then you'll need to check every node against every lookup value. I can certainly attempt to update the above to do that, but I won't get a chance to do so tonight; my home laptop isn't the easiest to work on with "complex" dynamic SQL (as Azure Data Studio doesn't provide good `PRINT` output, and I have a low res). – Thom A Dec 05 '18 at 20:05
no worries. Was just thinking about it. That might make order not so important. Feel free to post it as a separate answer. Would be interesting to compare just for personal education on how this can be done because your query is way above my paygrade so just want to pick your brain so I can learn more as you seem very knowledgeable in this area. – Denis Dec 05 '18 at 20:08

Replace multiple values in XML column based on a mapping table

3 Answers3

Some Explanation