0

I need to insert rows into database table from XML file. My XML format is like below :

<Main>
    <Parent>
        <Title>Title1</Title>
        <Code>ABC123</Code>
        <Name>name1</Name>
        <company>test1</company>
        <Children>
            <Child>
                <Title>t1</Title>               
                <ContentType>T1</ContentType>
                <TimeStarted>2018-03-01T10:47:46</TimeStarted>
                <TimeFinished>2018-03-01T10:48:08</TimeFinished>
            </Child>
            <Child>
                <Title>t2</Title>               
                <ContentType>T1</ContentType>
                <TimeStarted>2018-03-01T10:47:46</TimeStarted>
                <TimeFinished>2018-03-01T10:48:08</TimeFinished>    
            </Child>
        </Children>
    </Parent>
    <Parent>
        <Title>Title2</Title>
        <Code>def123</Code>
        <Name>name2</Name>
        <company>test2</company>
        <Children>
            <Child>
                <Title>t1</Title>               
                <ContentType>T1</ContentType>
                <TimeStarted>2018-03-01T10:47:46</TimeStarted>
                <TimeFinished>2018-03-01T10:48:08</TimeFinished>
            </Child>
            <Child>
                <Title>t2</Title>               
                <ContentType>T1</ContentType>
                <TimeStarted>2018-03-01T10:47:46</TimeStarted>
                <TimeFinished>2018-03-01T10:48:08</TimeFinished>
            </Child>
        </Children>
    </Parent>
</Main>

I need to insert parent level data (under parent tag - Title,Code,Name,company) in Table1 and Children level data in Table2. Table2 is having foriengy key reference to Table1.(which we need to get using scope identity once inserted parent record.)

This can be very large xml file. Main problem is I need to insert records one by one, means in loop as if first Parent has some issue then the process should not be aborted but should log error and proceed to second or next Parent tag.

I tried searching best approach but mostly all suggestions goes for not iterating through xml and direct insert to tables. Below are some references that I tried at.

TSQL Inserting records from XML string

How to get individual identity in an XML Insert?

I need suggestion that for particularly my requirement can be achieved without looping through? Also which would be best approach to do this, like first insert into some temp table and then process it or using xquery first fetch all records in flat data and then cursor over it?? or any other approach

Please suggest.

vibs
  • 115
  • 2
  • 14
  • XML support is **highly vendor-specific** - so please add a tag to specify whether you're using `mysql`, `postgresql`, `sql-server`, `oracle` or `db2` - or something else entirely. – marc_s Mar 27 '18 at 05:23
  • I really hope, that your xml does not use `` and `` but rather `` and ``... Same with children... You've tagged this with `[tsql]`. This is narrowing the choice, but still not clear... – Shnugo Mar 27 '18 at 10:51
  • Is the FK built through the given `` (which must be unique in this case)? – Shnugo Mar 27 '18 at 10:52
  • sorry I missed your comments...it was not , etc. that was just . same with the child element...I wrote it by mistake... Updated question. Thanks. – vibs Mar 29 '18 at 05:01

2 Answers2

1

In general it is a good idea to use a two-stepped import, especially in cases, where you expect issues.

If this is SQL-Server (I take this form the [tsql] tag) you might create a STORED PROCEDURE accepting an XML as parameter and using the following code to shred this into a staging table.

Hint: I used the * to read your nodes <Parent1>, <Parent2>. I hope, that you do not have name-numbered elements though. But it would work anyway...

DECLARE @xml XML=N'Your xml here';

WITH Parents AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ParentID
          ,prnt.value(N'(Title)[1]','nvarchar(max)') AS Title
          ,prnt.value(N'(Code)[1]','nvarchar(max)') AS Code
          ,prnt.value(N'(Name)[1]','nvarchar(max)') AS Name
          ,prnt.value(N'(company)[1]','nvarchar(max)') AS Company
          ,prnt.query(N'Children/*') AS ChildrenXML
    FROM @xml.nodes(N'/Main/*') AS Lvl1(prnt)
)
SELECT   Parents.ParentID
        ,Parents.Title
        ,Parents.Code
        ,Parents.Name
        ,Parents.Company
        ,ROW_NUMBER() OVER(PARTITION BY ParentID ORDER BY (SELECT NULL)) AS ChildID
        ,chld.value(N'(Title)[1]','nvarchar(max)') AS Child_Title
        ,chld.value(N'(ContentType)[1]','nvarchar(max)') AS Child_ContentType
        ,chld.value(N'(TimeStarted)[1]','nvarchar(max)') AS Child_TimeStarted
        ,chld.value(N'(TimeFinished)[1]','nvarchar(max)') AS Child_TimeFinished
INTO #StagingTable
FROM Parents
OUTER APPLY ChildrenXML.nodes(N'*') AS Lvl2(chld);

SELECT * FROM #StagingTable;

The result

+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+
| ParentID | Title  | Code   | Name  | Company | ChildID | Child_Title | Child_ContentType | Child_TimeStarted   | Child_TimeFinished  |
+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+
| 1        | Title1 | ABC123 | name1 | test1   | 1       | t1          | T1                | 2018-03-01T10:47:46 | 2018-03-01T10:48:08 |
+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+
| 1        | Title1 | ABC123 | name1 | test1   | 2       | t2          | T1                | 2018-03-01T10:47:46 | 2018-03-01T10:48:08 |
+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+
| 2        | Title2 | def123 | name2 | test2   | 1       | t1          | T1                | 2018-03-01T10:47:46 | 2018-03-01T10:48:08 |
+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+
| 2        | Title2 | def123 | name2 | test2   | 2       | t2          | T1                | 2018-03-01T10:47:46 | 2018-03-01T10:48:08 |
+----------+--------+--------+-------+---------+---------+-------------+-------------------+---------------------+---------------------+

This is very tolerant. All target columns are NVARCHAR(MAX), all parent rows are numbered, all children rows are numbered internally.

The second step - shifting this into your target tables - can be done easily from here. You can include any kind of evaluation, logging and/or error handling.

Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • Awesome !!! Thanks for your answer. Sorry for late reply. Ya I moved to stored proc only and tried to process xml there. I tried to create two temp table, one with Parent level data and other with parent+child data(flat record). I was able to insert rows but for some scenarios it gave me unique constraint errors (as part of my requirement) and failed for that scenarios. I was looking for giving identity (Id column to parent and child) to resolve that stuff and here is your answer giving me what I needed !! I will try out your solution.Thanks a lot !! – vibs Mar 28 '18 at 05:58
  • It worked perfectly !! Thanks again !!! Hasn't expected this much clean answer while posting question :) – vibs Mar 29 '18 at 04:11
0

I put results into a datatable. You can either replace the write to the datatable with writes to the database. Ir at end of code read from datatable and then write to database. It depend on how large you database the method you want to use.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.Data;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("Parent", typeof(string));
            dt.Columns.Add("Parent Title", typeof(string));
            dt.Columns.Add("Code", typeof(string));
            dt.Columns.Add("Name", typeof(string));
            dt.Columns.Add("Company", typeof(string));

            dt.Columns.Add("Child", typeof(string));
            dt.Columns.Add("Child Title", typeof(string));
            dt.Columns.Add("Content Type", typeof(string));
            dt.Columns.Add("Time Start", typeof(DateTime));
            dt.Columns.Add("Time Finish", typeof(DateTime));

            XDocument doc = XDocument.Load(FILENAME);

            List<XElement> parents = doc.Root.Elements().ToList();

            foreach (XElement parent in parents)
            {
                try
                {
                    string parentTagName = (string)parent.Name.LocalName;
                    string parentTitle = (string)parent.Element("Title");
                    string code = (string)parent.Element("Code");
                    string parentName = (string)parent.Element("Name");
                    string company = (string)parent.Element("company");
                    foreach (XElement child in parent.Element("Children").Elements())
                    {
                        try
                        {
                            string childTagName = child.Name.LocalName;
                            string childTitle = (string)child.Element("Title");
                            string contentType = (string)child.Element("ContentType");
                            DateTime timeStarted = (DateTime)child.Element("TimeStarted");
                            DateTime timeFinished = (DateTime)child.Element("TimeFinished");

                            dt.Rows.Add(new object[] {
                                parentTagName,
                                parentTitle,
                                code,
                                parentName,
                                company,
                                childTagName,
                                childTitle, 
                                contentType,
                                timeStarted,
                                timeFinished

                            });
                        }
                        catch (Exception ex)
                        {
                            Console.WriteLine(ex.Message);
                        }

                    }
                }

                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }


        }
    }
}
jdweng
  • 33,250
  • 2
  • 15
  • 20