0

I have a xml file that looks like the following structure:

<root>
 <companies>
  <company>
   <name>XYZ Co. </name>
   <addresses>
    <address>
     <street>12 Light Street</street>
    </address>
    <address>
     <street>44 King Street</street>
     <contacts>
      <contact>
        <fullName>Bob</fullName>
      </contact>
      <contact>
        <fullName>Alice</fullName>
      </contact>
     </contacts>
    </address>
   </addresses>
  </company>
 </companies>
 ...
 ...
</root>

So there could be many companies with addresses and within the addresses there could be many contacts. Now I am trying to query this xml to do inserts into the related tables (T-SQL). The tables are designed as follows:

  • Companies (Id, Name),
  • Addresses (AddressId, CompanyId, Street)
  • Contacts (ContactId, AddressId, FullName)

I have seen a similar question : Bulk insert nested xml with foreign key as identity column of first table, but that relied on one of the parent fields to be unique.

I am thinking of going down the cursor route.. hopefully theres a better option. Any help would be greatly appreciated.

Many thanks

Community
  • 1
  • 1
skub
  • 2,266
  • 23
  • 33
  • How are you going to map your entities to XML to entities in the database? You don't have an ID in the xml, will you rely on Names? If so, are they guaranteed to be unique? – ironstone13 Mar 30 '16 at 12:17
  • Thats whats I am struggling with at the moment. In the database, the primary key Ids are auto- incremented. The only way I would know about the parent->children relationships is from that xml structure. Also, the company names are not unique... – skub Mar 30 '16 at 12:26
  • I just edited my answer to reflect your additions. The query will generate IDs which you can use for your inserts... – Shnugo Mar 30 '16 at 12:47
  • 1
    The accepted answer you've linked uses `FROM OPENXML` with the prepare and remove procedures to call before and after your work. This is outdated... – Shnugo Mar 30 '16 at 13:30

1 Answers1

2

EDIT: With generated IDs

This solution relys on your XML being loaded into a variable @xml of type XML. If you need help with this, just call ;-)

WITH MyCompanies AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS CompID
          ,Comp.value('name[1]','varchar(max)') AS Comp_Name
          ,Comp.query('.') AS Comp_InnerXML
    FROM @xml.nodes('/root/companies/company') AS A(Comp)
)
,TheAddresses AS
(
    SELECT MyCompanies.*
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS AddrID
          ,Addr.value('street[1]','varchar(max)') AS Addr_Street 
          ,Addr.query('.') AS Addr_InnerXML
    FROM MyCompanies
    OUTER APPLY Comp_InnerXML.nodes('company/addresses/address') AS A(Addr) 
)
SELECT TheAddresses.*
      ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ContID
      ,Cont.value('fullName[1]','varchar(max)') AS Cont_FullName 
INTO #temp
FROM TheAddresses
OUTER APPLY Addr_InnerXML.nodes('address/contacts/contact') AS C(Cont);

SELECT * FROM #temp;

The result

+--------+-----------+--------+-----------------+--------+---------------+
| CompID | Comp_Name | AddrID | Addr_Street     | ContID | Cont_FullName |
+--------+-----------+--------+-----------------+--------+---------------+
| 1      | XYZ Co.   | 1      | 12 Light Street | 1      | NULL          |
+--------+-----------+--------+-----------------+--------+---------------+
| 1      | XYZ Co.   | 2      | 44 King Street  | 2      | Bob           |
+--------+-----------+--------+-----------------+--------+---------------+
| 1      | XYZ Co.   | 2      | 44 King Street  | 3      | Alice         |
+--------+-----------+--------+-----------------+--------+---------------+
| 2      | ABC Co.   | 3      | 12 ABC Street   | 4      | NULL          |
+--------+-----------+--------+-----------------+--------+---------------+
| 2      | ABC Co.   | 4      | 44 ABC Street   | 5      | Bob ABC       |
+--------+-----------+--------+-----------------+--------+---------------+
| 2      | ABC Co.   | 4      | 44 ABC Street   | 6      | Alice ABC     |
+--------+-----------+--------+-----------------+--------+---------------+

Within #temp you have the data. Use DISTINCT to do your inserts from there...

--Your companies
SELECT DISTINCT CompID,Comp_Name FROM #temp;
--Your address
SELECT DISTINCT CompID,AddrID,Addr_Street  FROM #temp;
--Your contacts
SELECT DISTINCT AddrID,ContID,Cont_FullName  FROM #temp;

previous

This query will get the data into table data.

But: You should insert this into related tables (company,address,contacts) rather an all in one.

You can create dummy IDs using ROW_NUMBER with OVER(PARTITION BY ...) to generate dummy IDs but this will work only in the first run.

SELECT Comp.value('name[1]','varchar(max)') AS Comp_Name
      ,Addr.value('street[1]','varchar(max)') AS Addr_Street 
      ,Cont.value('fullName[1]','varchar(max)') AS Cont_FullName 
FROM @xml.nodes('/root/companies/company') AS A(Comp)
OUTER APPLY Comp.nodes('addresses/address') AS B(Addr) 
OUTER APPLY Addr.nodes('contacts/contact') AS C(Cont)

The result

Comp_Name   Addr_Street     Cont_FullName
XYZ Co.     12 Light Street NULL
XYZ Co.     44 King Street  Bob
XYZ Co.     44 King Street  Alice
ABC Co.     12 ABC Street   NULL
ABC Co.     44 ABC Street   Bob ABC
ABC Co.     44 ABC Street   Alice ABC
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • You are a legend! This works - thanks heaps. If I could buy you a beer over the internet, I would :) – skub Mar 30 '16 at 23:30