I need to read all the elements of an XML file that has a tree-like hierarchy and then populate a class with them. Here is a sample:

<?xml version="1.0"?>
<WBs>
  <WP GeneralID ="1">
    <P name="General_Header">
      <Q  name= "Category">
        <Tools>
          <Tool id ="1">
            <TName> QT 1</TName>
            <Rev>1</Rev>
          </Tool>
          <Tool id ="2">
            <TName> QT 2</TName>
            <Rev>3</Rev>
          </Tool>
        </Tools>
        <Contacts>
          <Contact>
            <CName>MM</CName>
            <CMail>m.m@i.com</CMail>
          </Contact>
          <Contact>
            <CName>AM</CName>
            <CMail>a.m@i.com</CMail>
          </Contact>
        </Contacts>
      </Q>
      <ss  name= "Category">
        <Tools>
          <Tool id ="1">
            <TName> SST 1</TName>
            <Rev>3</Rev>
          </Tool>
          <Tool id ="2">
            <TName> SST 2</TName>
            <Rev>3</Rev>
          </Tool>
        </Tools>
        <Contacts>
          <Contact>
            <CName>KE</CName>
            <CMail>K.E@i.com</CMail>
          </Contact>
          <Contact>
            <CName>AM</CName>
            <CMail>a.m@i.com</CMail>
          </Contact>
        </Contacts>
      </ss>
    </P>
  </WP>
</WBs>

I have made a class for each of WP, Tool, and Contact. The WP class is as follows:

class WP
{
    public string GeneralHeader { get; set; }  // level 1, i.e. P
    public string Category { get; set; }       // level 2
    public string SubCategory { get; set; }    // level 3
    public string SubSubCategory { get; set; } // level 4
    public List<Tool> WPTools { get; set; }
    public List<Contact> WPContacts { get; set; }
}

What I want to do is traverse all of the elements and their child elements and populate the WP class, so that whenever a parent has two child elements at the same level, each child ends up in a separate WP object that shares the same parent values.

For example, for the sample above I was hoping to get two WP objects with the same "General_Header" value, "P", one with "Category" equal to "Q" and the other with "ss", and then populate each one with its corresponding tools and contacts. The rest of the XML file has the same issue at different levels: some WP tags branch at the "SubCategory" level and others at the "SubSubCategory" level.

All I can think of is to change the XML file so that every complete branch (down to the tools and contacts tags) is wrapped in its own set of <WP>---</WP> tags, but then I would repeat the common parent tags, which I don't think is an efficient use of XML.

Any suggestions?

Thanks in advance.

JNYRanger
user1874288
  • are you familiar with DataSet load from xml method..? you can load this into the Class object very easily from there .. try a google search for great example `C# Stackoverflow XML To DataSet` tons of working examples. – MethodMan Mar 09 '15 at 18:27
  • thanks for your fast response. i will check the Dataset. – user1874288 Mar 09 '15 at 20:21

1 Answer

This is really a peculiar piece of XML. Usually we would see this structure:

<Category name= "Q">

Instead of this:

<Q name = "Category">

I'd say it really comes down to what you're going to do with that XML afterwards. As I don't know anything about that, I'll give it the benefit of the doubt and assume this is the correct structure. But if it's not, please change it.
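To illustrate why the conventional layout is easier to query, here is a minimal sketch. It assumes the XML were restructured so that the tag name carries the level and the name attribute carries the value. The sample string and the ConventionalLayout class are hypothetical and do not come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ConventionalLayout
{
    // Hypothetical restructured sample: the tag says what kind of node it is,
    // the name attribute says which one it is.
    public const string Sample =
        "<GeneralHeader name=\"P\">" +
        "<Category name=\"Q\" />" +
        "<Category name=\"ss\" />" +
        "</GeneralHeader>";

    // Returns the name attribute of every Category element, in document order.
    public static List<string> CategoryNames(string xml)
    {
        return XDocument.Parse(xml)
                        .Descendants("Category")
                        .Select(c => (string)c.Attribute("name"))
                        .ToList();
    }
}
```

With this layout, elements can be selected by tag name directly, for example ConventionalLayout.CategoryNames(ConventionalLayout.Sample), instead of filtering every element by its name attribute.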

As you said, you don't want to repeat <P name="General_Header"> over and over again for each category and subcategory.

First off, let's parse your XML text (or use XDocument.Load if you're reading from a file):

XDocument document = XDocument.Parse(content);

Now let's get the header using a bit of LINQ:

var generalHeader = document.Descendants()
                            .Where(p=>p.Attributes("name")
                                       .Any(a=>a.Value=="General_Header"))
                            .FirstOrDefault();
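Note that FirstOrDefault returns null when nothing matches, so the later call to generalHeader.Descendants() would throw a NullReferenceException on a file without such an element. A minimal sketch of a guarded lookup follows. The HeaderLookup class and FindGeneralHeader method are hypothetical names, and the predicate overload of FirstOrDefault is used here as an equivalent, shorter form of the Where/Any query above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class HeaderLookup
{
    // Returns the first element whose name attribute is "General_Header",
    // or throws a descriptive exception when the document has none.
    public static XElement FindGeneralHeader(XDocument document)
    {
        // Casting a missing attribute to string yields null, so elements
        // without a name attribute are simply skipped.
        XElement header = document.Descendants()
            .FirstOrDefault(p => (string)p.Attribute("name") == "General_Header");

        if (header == null)
            throw new InvalidOperationException(
                "No element with name=\"General_Header\" was found.");

        return header;
    }
}
```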

Now let's get all Categories, SubCategories, SubSubCategories altogether:

var allCategories = generalHeader.Descendants()
                                 .Where(p=>p.Attributes("name")
                                            .Any(a=>new[]{"Category","SubCategory","SubSubCategory"}
                                                         .Contains(a.Value)));

OK, now we need to create a WP object for each category we have. But we also have to fill the Category, SubCategory and SubSubCategory properties, so we need to know which level each element belongs to (cat, sub or subsub). For this I've created the following method:

// Needs: System.Collections.Generic, System.Linq, System.Xml.Linq, System.Xml.Serialization
public static IEnumerable<WP> CreateWP(XElement header, IEnumerable<XElement> categories)
{
    // The serializers are reusable, so create them once rather than once per category.
    XmlSerializer xt = new XmlSerializer(typeof(Tool));
    XmlSerializer xc = new XmlSerializer(typeof(Contact));

    foreach(XElement category in categories)
    {
        WP wp = new WP();
        wp.GeneralHeader = header.Name.LocalName;

        // For each level, search the element and its ancestors for a matching
        // name attribute and keep the tag name of the element that carries it.
        wp.Category = category.Ancestors().Concat(new []{category}).Where(c=> c.Attributes("name").Any(a=>a.Value == "Category")).Select(elem=>elem.Name.LocalName).FirstOrDefault();
        wp.SubCategory = category.Ancestors().Concat(new []{category}).Where(c=> c.Attributes("name").Any(a=>a.Value == "SubCategory")).Select(elem=>elem.Name.LocalName).FirstOrDefault();
        wp.SubSubCategory = category.Ancestors().Concat(new []{category}).Where(c=> c.Attributes("name").Any(a=>a.Value == "SubSubCategory")).Select(elem=>elem.Name.LocalName).FirstOrDefault();

        // Deserialize every Tool and Contact element below this category.
        wp.WPTools = category.Descendants("Tool").Select(t=> (Tool) xt.Deserialize(t.CreateReader())).ToList();
        wp.WPContacts = category.Descendants("Contact").Select(t=> (Contact) xc.Deserialize(t.CreateReader())).ToList();
        yield return wp;
    }
}
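The method relies on XmlSerializer being able to map the Tool and Contact elements onto classes, and those classes are not shown in the question. Here is a minimal sketch of what they could look like. The property names Id, Name, Rev and Mail, as well as the SerializerCheck helper, are assumptions made for illustration. Only the XML names (id, TName, Rev, CName, CMail) come from the sample.

```csharp
using System.Xml.Linq;
using System.Xml.Serialization;

// Assumed mapping for <Tool id="1"><TName>...</TName><Rev>...</Rev></Tool>.
// XmlSerializer requires public classes with a parameterless constructor.
public class Tool
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("TName")]
    public string Name { get; set; }

    [XmlElement("Rev")]
    public int Rev { get; set; }
}

// Assumed mapping for <Contact><CName>...</CName><CMail>...</CMail></Contact>.
public class Contact
{
    [XmlElement("CName")]
    public string Name { get; set; }

    [XmlElement("CMail")]
    public string Mail { get; set; }
}

public static class SerializerCheck
{
    // Deserializes a single <Tool> fragment the same way CreateWP does:
    // through an XmlReader created from the XElement.
    public static Tool ParseTool(string xml)
    {
        var serializer = new XmlSerializer(typeof(Tool));
        return (Tool)serializer.Deserialize(XElement.Parse(xml).CreateReader());
    }
}
```

By default the serializer expects the root element to be named after the class, which is why the class names Tool and Contact line up with the tags in the sample.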

The important line is this:

category.Ancestors().Concat(new []{category})
                    .Where(c=> c.Attributes("name").Any(a=>a.Value == "Category"))
                    .Select(elem=>elem.Name.LocalName).FirstOrDefault();

It searches the element itself and all of its ancestors for one whose name attribute equals "Category" and returns that element's tag name, or null if none is found.
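Since the same lookup is repeated for each level, it could be pulled into a helper. The sketch below is one possible way to do that and is not part of the original code: LevelLookup and FindLevel are hypothetical names. It uses XElement.AncestorsAndSelf(), which yields the element first and then its ancestors. That order differs from Ancestors().Concat(...), but the result is the same as long as each level appears at most once on the path to the root.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class LevelLookup
{
    // Returns the tag name of the nearest element (self or ancestor) whose
    // name attribute equals the given level, or null when there is none.
    public static string FindLevel(XElement element, string level)
    {
        return element.AncestorsAndSelf()
                      .Where(e => (string)e.Attribute("name") == level)
                      .Select(e => e.Name.LocalName)
                      .FirstOrDefault();
    }
}
```

With such a helper, the three assignments in CreateWP would become calls like LevelLookup.FindLevel(category, "Category").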

Basically, what we're doing here is flattening all elements (all kinds of categories) and creating one object for each category found. I've used this file to test this:

http://pastebin.com/raw.php?i=cAUnhUgf

And here's the complete fiddle so you can see for yourself:

https://dotnetfiddle.net/ZBvmhe

EDIT: Explaining this statement:

var allCategories = generalHeader.Descendants() 

This gets all descendant elements (categories, subcategories, tools, contacts, everything).

.Where(p=>p.Attributes("name")       
                            .Any(a=>new[]{"Category","SubCategory","SubSubCategory"}.Contains(a.Value)));

And this translates as

Where any of the descendant p's attributes whose XName is name has its value set as either Category, SubCategory, or SubSubCategory

Which means:

p.Attributes("name")  

Gets all attributes of p identified as "name"

.Any(a=>new[]{"Category","SubCategory","SubSubCategory"}.Contains(a.Value))

This returns true if any of the name attribute values equals one of the three possibilities (Category, SubCategory or SubSubCategory), and false otherwise.

In short, it will get all descendants of the general header that are either a Category, SubCategory or SubSubCategory.

Conrad Clark
  • Thanks so much for your fast and detailed answer. I will check it out. BTW regarding the XML Format that was my opinion as well, i will try to change it. – user1874288 Mar 09 '15 at 20:23
  • Can you please explain more the line: `var allCategories = generalHeader.Descendants() .Where(p=>p.Attributes("name") .Any(a=>new[]{"Category","SubCategory","SubSubCategory"} .Contains(a.Value)));` – user1874288 Mar 10 '15 at 00:18
  • @user1874288 Explained. See if it fits. – Conrad Clark Mar 10 '15 at 11:35