You can try the following three approaches.
Method #: 1
We will find out if there is at least one root level node.
By the way SQL Server allows not well-formed XML, i.e. just XML fragments without a root element. That's why I added that use case to the sample data population.
Also, I added an XML comment for completeness.
The outcome's meaning:
- 2 (or any number greater than 1) - an XML fragment
- 1 - well-formed XML
- 0 - no XML elements, text or comment nodes.
SQL
-- DDL and sample data population, start
DECLARE @TestTable TABLE (ID INT IDENTITY (1, 1) PRIMARY KEY, XmlColumn XML NOT NULL);
INSERT INTO @TestTable (XmlColumn) VALUES
(N'<root><child /></root>'),
(N'<city>Miami</city><city>Orlando</city>'),
(N'Foo, this is not XML'),
(N'<root><child /></root>Foo'),
(N'<!-- -->'),
(N'<root><parent><child /></parent></root>');
-- DDL and sample data population, end
SELECT *
, XmlColumn.value('count(/*)', 'INT') AS Result
FROM @TestTable;
Output
+----+-----------------------------------------+-----------+
| ID | XmlColumn | Result |
+----+-----------------------------------------+-----------+
| 1 | <root><child /></root> | 1 |
| 2 | <city>Miami</city><city>Orlando</city> | 2 |
| 3 | Foo, this is not XML | 0 |
| 4 | <!-- --> | 0 |
| 5 | <root><parent><child /></parent></root> | 1 |
+----+-----------------------------------------+-----------+
Method #: 2
By using an instance of element()
XQuery construct
SELECT *
, XmlColumn.query('<root>{
for $x in /*
return
if ($x instance of element()) then <r/> else ()
}</root>').value('count(/root/r)', 'INT') AS Result
, XmlColumn.query('for $x in .
return if ($x eq /*[1]) then "well-formed" else "not well- formed"').value('.','VARCHAR(20)') AS [well-formed]
FROM @TestTable;
Output
+----+-----------------------------------------+--------+-----------------+
| ID | XmlColumn | Result | well-formed |
+----+-----------------------------------------+--------+-----------------+
| 1 | <root><child /></root> | 1 | well-formed |
| 2 | <city>Miami</city><city>Orlando</city> | 2 | not well-formed |
| 3 | Foo, this is not XML | 0 | not well-formed |
| 4 | <root><child /></root>Foo | 1 | not well-formed |
| 5 | <!-- --> | 0 | not well-formed |
| 6 | <root><parent><child /></parent></root> | 1 | well-formed |
+----+-----------------------------------------+--------+-----------------+
Method #: 3
A complete solution. The algorithm is based on comparison of counts: count of any type of nodes vs. count of elements only. Additionally, it gives a breakdown of node types in the NodeList column for easy understanding of what is going on.
;WITH rs AS
(
SELECT *
, XmlColumn.value('count(/node())', 'INT') AS NodeCount -- any type of nodes
, XmlColumn.value('count(/*)', 'INT') AS ElementCount -- elements only
, XmlColumn.query('
for $x in /node()
return if ($x instance of element()) then text {"element()"}
else if ($x instance of text()) then text {"text()"}
else if ($x instance of comment()) then text {"comment()"}
else if ($x instance of processing-instruction()) then text {"processing-instruction()"}
else ()
') AS NodeList
FROM @TestTable
)
SELECT *
, CASE
WHEN NodeCount = 1 AND ElementCount = 1 THEN 'well-formed'
WHEN NodeCount = 2 AND ElementCount = 1
AND LEFT(NodeList, 24) = 'processing-instruction()' THEN 'well-formed'
WHEN NodeCount > 1 AND (ElementCount = NodeCount) THEN 'XML fragment'
WHEN NodeCount > ElementCount THEN 'not well-formed'
ELSE '???'
END AS Result
FROM rs;
Output
+----+-----------------------------------------+-----------+--------------+--------------------+-----------------+
| ID | XmlColumn | NodeCount | ElementCount | NodeList | Result |
+----+-----------------------------------------+-----------+--------------+--------------------+-----------------+
| 1 | <root><child /></root> | 1 | 1 | element() | well-formed |
| 2 | <city>Miami</city><city>Orlando</city> | 2 | 2 | element()element() | XML fragment |
| 3 | Foo, this is not XML | 1 | 0 | text() | not well-formed |
| 4 | <root><child /></root>Foo | 2 | 1 | element()text() | not well-formed |
| 5 | <!-- --> | 1 | 0 | comment() | not well-formed |
| 6 | <root><parent><child /></parent></root> | 1 | 1 | element() | well-formed |
+----+-----------------------------------------+-----------+--------------+--------------------+-----------------+