
I need to extract part of an XML document. The file can contain thousands of nodes, and I would like to get only a subset of them as an XML string.

My XML structure:

<ResponseMessage xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <ErrorResponse>
        <Code>SUCCESS</Code>
        <Message>Success</Message>
    </ErrorResponse>
    <OutputXml>
        <Response>
            <Product>
                <child1>xxx</child1>
                <child2>xxx</child2>
                ...
            </Product>
            <Product>
                <child1>xxx</child1>
                <child2>xxx</child2>
                ...
            </Product>
            ...
        </Response>
    </OutputXml>
</ResponseMessage>

I'm getting the XML from a web service like this:

...
System.Net.WebResponse wResponse = req.GetResponse();
reqstream = wResponse.GetResponseStream();
System.IO.StreamReader reader = new System.IO.StreamReader(reqstream);

System.Xml.Linq.XDocument xmlResponse = System.Xml.Linq.XDocument.Parse(reader.ReadToEnd());

Then I tried to put the XML in a generic collection to process it using LINQ:

int startIndex = 0;
int nbItem = 25;
System.Text.StringBuilder outputXml = new System.Text.StringBuilder();
System.Collections.Generic.IEnumerable<System.Xml.Linq.XElement> partialList =
   xmlResponse.Elements("Response").Skip(startIndex).Take(nbItem);

foreach (System.Xml.Linq.XElement x in partialList)
{
    outputXml.Append(x.ToString());
}

My problem is that my list is always empty.

Distwo
  • Note that LINQ XDocument.Parse will construct objects in memory for the entire XML document, even if you're only interested in part of it. If your XML document is so incredibly huge that its object representation won't comfortably fit in available RAM, then you should use the older System.Xml.XmlTextReader to step through the XML tokens as a stream (not constructing objects) until you reach the data you're interested in. It's laborious code, but memory-thrifty when you're looking for a needle in gigabytes of XML data. – dthorpe Dec 14 '12 at 17:43
  • This is a good note... I need to check the memory usage. I may have to change my code to save RAM. – Distwo Dec 14 '12 at 19:08
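dthorpe's streaming suggestion can be sketched as follows. This is a minimal sketch, using the modern `XmlReader` in place of the older `XmlTextReader`, and the inline XML string is a hypothetical stand-in for the response stream; only one `Product` subtree is materialized at a time.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class StreamProducts
{
    static void Main()
    {
        // Hypothetical sample mirroring the structure in the question.
        string xml =
            "<ResponseMessage><OutputXml><Response>" +
            "<Product><child1>a</child1></Product>" +
            "<Product><child1>b</child1></Product>" +
            "</Response></OutputXml></ResponseMessage>";

        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Product")
                {
                    // Materialize only this subtree; the rest streams by.
                    XElement product = (XElement)XNode.ReadFrom(reader);
                    count++;
                }
                else
                {
                    reader.Read();
                }
            }
        }
        Console.WriteLine(count); // prints 2
    }
}
```

With a real response you would pass `wResponse.GetResponseStream()` to `XmlReader.Create` instead of a `StringReader`, so the document is never held in memory as a whole.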

2 Answers


You can use LINQ to XML with the following code:

IEnumerable<XElement> elements = xmlResponse.Root.Element("OutputXml").Element("Response").Elements("Product");

foreach(XElement element in elements)
{
    // Do Work Here
}

This filters the list down to just the Product elements and selects them by name rather than by position. Relying on positional indexes into XML is not a great idea because the structure can change.
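To cover the paging requirement from the question, the same navigation can be combined with `Skip`/`Take`. A sketch, reusing the question's `startIndex`/`nbItem` variables and a hypothetical inline sample:

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

class PageProducts
{
    static void Main()
    {
        // Hypothetical sample mirroring the question's structure.
        XDocument xmlResponse = XDocument.Parse(
            "<ResponseMessage><OutputXml><Response>" +
            "<Product><child1>a</child1></Product>" +
            "<Product><child1>b</child1></Product>" +
            "<Product><child1>c</child1></Product>" +
            "</Response></OutputXml></ResponseMessage>");

        int startIndex = 1;
        int nbItem = 2;

        StringBuilder outputXml = new StringBuilder();
        var partialList = xmlResponse.Root
            .Element("OutputXml")
            .Element("Response")
            .Elements("Product")
            .Skip(startIndex)
            .Take(nbItem);

        foreach (XElement x in partialList)
        {
            outputXml.Append(x.ToString(SaveOptions.DisableFormatting));
        }
        Console.WriteLine(outputXml.ToString());
        // prints <Product><child1>b</child1></Product><Product><child1>c</child1></Product>
    }
}
```

The original code was empty because `xmlResponse.Elements("Response")` only looks at the document's top level (where the root is `ResponseMessage`); navigating down through `Root.Element("OutputXml").Element("Response")` reaches the `Product` nodes.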

Caleb Keith
  • Thanks for your answer. I see that I can go through my XML tree using the XmlReader. That may help, but what will the performance be like? For example, if I want 25 Products starting at index 6500, would I have to increment manually until I reach the right index? LINQ to XML seems more efficient and probably easier; that's why I was trying to use the `Elements().Skip().Take()` methods. I don't have a Product object on my client side, so I have to stick with generic objects. – Distwo Dec 14 '12 at 17:10

You can use XPathEvaluate to read a subtree.

If your list is empty, chances are it is a namespace problem: you did not account for the namespace declaration in your document, xmlns:i="http://www.w3.org/2001/XMLSchema-instance". XDocument/XElement cannot resolve namespaces automatically.

See this topic on how to use namespaces with LINQ-to-XML.
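If the real response does use a default namespace (the snippet in the question only declares the `i` prefix), element lookups must qualify names with an `XNamespace`. A minimal sketch with a hypothetical namespace URI:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class NamespaceLookup
{
    static void Main()
    {
        // Hypothetical document with a default namespace.
        XDocument doc = XDocument.Parse(
            "<Response xmlns=\"http://example.com/schema\">" +
            "<Product><child1>a</child1></Product>" +
            "</Response>");

        XNamespace ns = "http://example.com/schema";

        // An unqualified name matches nothing; the qualified name matches.
        Console.WriteLine(doc.Root.Elements("Product").Count());      // prints 0
        Console.WriteLine(doc.Root.Elements(ns + "Product").Count()); // prints 1
    }
}
```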

Victor Zakharov