3

I have thousands of xml files to get data from. To achieve that I have used the cross apply method.

But the problem is, some nodes are not always present in the xml-file. In my example this is the node 'valueX' (inside the product node). And this node is only present in the last person.

 <invoice>
    <person>
        <name>John</name>
        <product>
            <id>abc</id>
            <price>100</price>

        </product>
        <product>
            <id>def</id>
            <price>99</price>

        </product>
    </person>

    <person>
        <name>Mary</name>
        <product>
            <id>abc</id>
            <price>200</price>

        </product>
    </person>

    <person>
        <name>Peter</name>
        <product>
            <id>abc</id>
            <price>300</price>

        </product>
    </person>

    <person>
        <name>Sue</name>
        <product>
            <id>abc</id>
            <price>400</price>
            <valueX>
                <name>test</name>
            </valueX>
        </product>
    </person>

</invoice>

If I query now the xml file with cross apply by using the 'valueX' node, I get only the one record that contains this node.

declare @tab table  (
                          id int
                        , xmlData xml
                    )

declare @xml nvarchar(max)

set @xml = '
                <invoice>
                    <person>
                        <name>John</name>
                        <product>
                            <id>abc</id>
                            <price>100</price>

                        </product>
                        <product>
                            <id>def</id>
                            <price>99</price>

                        </product>
                    </person>

                    <person>
                        <name>Mary</name>
                        <product>
                            <id>abc</id>
                            <price>200</price>

                        </product>
                    </person>

                    <person>
                        <name>Peter</name>
                        <product>
                            <id>abc</id>
                            <price>300</price>

                        </product>
                    </person>

                    <person>
                        <name>Sue</name>
                        <product>
                            <id>abc</id>
                            <price>400</price>
                            <valueX>
                                <name>test</name>
                            </valueX>
                        </product>
                    </person>

                </invoice>
            '


insert into @tab (id, xmlData)
values  (
              1
            , @xml
        )


select    t.id
        , Person.ref.value('name[1]','nvarchar(255)')   as PersonName
        , Product.ref.value('id[1]','nvarchar(3)')      as ProductID
        , Product.ref.value('price[1]','int')           as ProductPrice
        , ValueX.ref.value('name[1]','nvarchar(255)')   as ValueXName

from @tab as t

cross apply t.xmlData.nodes('invoice/person') Person(ref)
    cross apply Person.ref.nodes('product') Product(ref)
        cross apply Product.ref.nodes('valueX') ValueX(ref)    

But I need to have all records even when the node is not present.

How can I achieve that?

gbr
  • 33
  • 3

1 Answers1

1

Just change your last part to

from @tab as t

outer apply t.xmlData.nodes('invoice/person') Person(ref)
    outer apply Person.ref.nodes('product') Product(ref)
        outer apply Product.ref.nodes('valueX') ValueX(ref)   

CROSS APPLY is like INNER JOIN while OUTER APPLY is like LEFT JOIN

Shnugo
  • 66,100
  • 9
  • 53
  • 114