I have strings inside XML that contain low-order ASCII. Specifically, the EDI contains special characters like char(28), char(29), char(30) (aka File Separator/Group Separator/Record Separator). The way the devs encode it in the XML string, these characters get changed to "", where the x1C is the hex representation (char(28), file separator). We then use a CLR to convert it back to the original text to store in a table, with the low-ascii characters in it (we don't store the XML, just the data within, as fields)
However, I need to do this in Azure SQL DB, which doesn't support CLR.
I'm unsure if the best/fastest way to do this is to write the XQuery to extract it in the correct form, or just use XQuery "value" and post-process it with T-SQL: nested REPLACE, CROSS/OUTER APPLY with REPLACE, or even possibly STUFF (TRANSLATE won't work because it's multiple characters).
Looking online, I can see where other versions of XQuery support fn:replace or bin:decode-string, but it doesn't appear that SQL Server's XQuery supports these methods, so I'm trying to figure out how I can do this ("replace value of", maybe?).
DECLARE @xml_edi XML = '<Bundle><RawData>ABCDE&#x1E;&#x1C;FGHIJK&#x1D;LMNOP</RawData></Bundle>'
SELECT x.y.value('./RawData[1]','varchar(max)')
from @xml_edi.nodes('/Bundle')x(y)
Expected results:
ABCDEFGHIJKLMNOP --note that this shows all 3 as the same character, that's a limitation of the browser/editor
(ABCDE, then char(30) & char(28), then FGHIJK, then the low-ascii character 29, then LMNOP)
Some links I found while writing this:
- Function to convert hex to ascii in xquery
- SQL Nested Replace
- https://www.sqlshack.com/overview-of-the-sql-replace-function/ (TRANSLATE)
- https://dba.stackexchange.com/questions/201851/replace-character-without-using-looped-replace-function (interesting use of STUFF)
- https://bertwagner.com/posts/how-to-eliminate-ugly-nested-replace-functions/ (CROSS APPLY)