I want to loop over an XML tree and test two conditions, but I ran into some issues.

  1. I can't compare an XElement like an integer. I've read some other posts (besides these examples) and so far I haven't been able to figure it out. :S

Link_1

Link_2

  2. In the foreach loop I can only change the first matching element, but I would like to change all matching XElements. I tried Any() and All() without success.

My XML files: (for protein)

<?xml version="1.0" encoding="utf-8"?>
<ProteinStructure>
  <SecondaryStructure>
    <StName>HELX_P1</StName>
    <StType>alphaHelix</StType>
    <AmAcidStart>1</AmAcidStart>
    <AmAcidEnd>2</AmAcidEnd>
  </SecondaryStructure>
  <SecondaryStructure>
    <StName>HELX_P2</StName>
    <StType>alphaHelix</StType>
    <AmAcidStart>43</AmAcidStart>
    <AmAcidEnd>53</AmAcidEnd>
  </SecondaryStructure>

My XML files: (for atom)

<?xml version="1.0" encoding="utf-8"?>
<Molecule>
  <Atom>
    <AtNum>1</AtNum>
    <AmAcSeq>2</AmAcSeq>
    <AtType>N</AtType>
    <StType>turn</StType>
  </Atom>
  <Atom>
    <AtNum>2</AtNum>
    <AmAcSeq>2</AmAcSeq>
    <AtType>CA</AtType>
    <StType>turn</StType>
  </Atom>
  <Atom>
    <AtNum>2</AtNum>
    <AmAcSeq>2</AmAcSeq>
    <AtType>C</AtType>
    <StType>turn</StType>
  </Atom>
  <Atom>
    <AtNum>1</AtNum>
    <AmAcSeq>3</AmAcSeq>
    <AtType>N</AtType>
    <StType>turn</StType>
  </Atom>

My code so far:

   XDocument atom = XDocument.Load (@"C:\Users\RuiGarcia\Documents\MIB\INESC\C#Tutorial\ProjectC#\Molecule_00\PDBLibary_00\Data\__3Q26.xml");
   XDocument protein = XDocument.Load (@"C:\Users\RuiGarcia\Documents\MIB\INESC\C#Tutorial\ProjectC#\Molecule_00\PDBLibary_00\Data\_3Q26.xml");

   //Change secondary structure tag type "turn" to helixAlpha or betaSheet
   foreach (XElement amAc in protein.Descendants ("SecondaryStructure"))
   {
    atom.Element ("Molecule")
     .Elements ("Atom")
     .Where (x => (int?)x.Element("AmAcSeq") >= (int?)amAc.Element("AmAcidStart") && x => (int?)ToInt16.x.Element ("AmAcSeq") <= (int?)amAc.Element("AmAcidEnd"))
     .Select (x => x.Element("StType")).FirstOrDefault().SetValue(amAc.Element("StType"));
//    Console.WriteLine (amAc.Element("AmAcidStart").Value);
//     .Where (x => Convert.ToInt16(x.Element("AmAcSeq").Value) <= Convert.ToInt16(amAc.Element("AmAcidStart").Value) && x => Convert.ToInt16(x.Element ("AmAcSeq").Value) >= Convert.ToInt16(amAc.Element("AmAcidStart")))
//     .Where (x => (int?)x.Element("AmAcSeq") >= (int?)amAc.Element("AmAcidStart") && x => (int?)ToInt16.x.Element ("AmAcSeq") <= (int?)amAc.Element("AmAcidStart"))
   }

   atom.Save(@"C:\Users\RuiGarcia\Documents\MIB\INESC\C#Tutorial\ProjectC#\Molecule_00\PDBLibary_00\Data\__3Q26.xml");

P.S. - How can I format the code correctly so that it isn't split up in the code sample?

Thanks in advance.

ruiabg
  • 1) The simplest way to format your code is to auto-format in Visual Studio then copy/paste here. 2) Please share an example of the XML which produces your problem. 3) Your code doesn't even compile -- what is `ToInt16.x.Element ("AmAcSeq")`? If you can create a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of your problem it's more likely we will be able to help. – dbc Aug 06 '15 at 18:01
  • It would help if you posted a small piece of the XML – jdweng Aug 06 '15 at 18:33
  • Just added my XML example files. – ruiabg Aug 06 '15 at 22:41
  • In the example you give, your `Atom` XElements have `AmAcSeq = 2` which will not match any of your `SecondaryStructure` XElements. – dbc Aug 06 '15 at 22:41
  • The aim is to check whether an atom's AmAcSeq number lies between the protein's AmAcidStart and AmAcidEnd numbers and, if so, to change the atom's StType to the protein's StType. The atom's default StType value is turn. I'll change the example so there is a condition to change both. – ruiabg Aug 06 '15 at 22:46
  • Please provide samples of your input and output, ideally minimal examples that don't have extra elements. That way there is less uncertainty as to exactly what transformation you are trying to make. For instance you are using `FirstOrDefault` instead of calling a second `foreach` loop and it is hard to tell if that is a mistake or not. – Guvante Aug 06 '15 at 22:47
  • Changed my code to match the requirements, so that the last XElement is the only one that keeps its StType element value. I didn't think about that possibility (two foreach loops), but comparing the number tags remains a problem. – ruiabg Aug 06 '15 at 22:54

2 Answers

You need a second foreach to iterate through the results. I lifted some values and included an example below of doing that.

XDocument atom = XDocument.Load (@"...\__3Q26.xml");
XDocument protein = XDocument.Load (@"...\_3Q26.xml");

//Change secondary structure tag type "turn" to helixAlpha or betaSheet
foreach (XElement amAc in protein.Descendants ("SecondaryStructure"))
{
    int? start = (int?)amAc.Element("AmAcidStart");
    int? end = (int?)amAc.Element("AmAcidEnd");
    string stType = (string)amAc.Element("StType");

    IEnumerable<XElement> atoms = atom.Element("Molecule").Elements("Atom");

    // Here we are iterating again in each matching result
    foreach (XElement atomElement in atoms.Where(elem => (int?)elem.Element("AmAcSeq") >= start && (int?)elem.Element("AmAcSeq") <= end))
    {
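        // Set the StType child; calling SetValue on the Atom itself would replace all of its children
        atomElement.SetElementValue("StType", stType);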
    }
}
Guvante
  • I've tried both solutions. The solution from _dbc_ works well, and selecting directly in the foreach seems to be good practice. The solution from _Guvante_ returns two errors, probably because I'm missing a reference. The error is: "Cannot implicitly convert type 'System.Collections.Generic.IEnumerable<System.Xml.Linq.XElement>' to 'System.Xml.Linq.XElement'. An explicit conversion exists (are you missing a cast?)" – ruiabg Aug 06 '15 at 23:14
  • @ruiabg: Nope, my mistake. I typically use `var` and forgot that `atoms` is of type `IEnumerable<XElement>`; it should be fixed now. – Guvante Aug 07 '15 at 18:58

I see a couple of problems here:

  1. You need to use nested foreach loops rather than a `FirstOrDefault()`.
  2. Your condition `(int?)ToInt16.x.Element ("AmAcSeq") <= (int?)amAc.Element("AmAcidStart")` doesn't even compile, and should almost certainly be something like `(int?)x.Element ("AmAcSeq") <= (int?)amAc.Element("AmAcidEnd")`

Thus:

        foreach (XElement secondaryStructure in protein.Descendants ("SecondaryStructure"))
        {
            foreach (var atomStType in atom.Element ("Molecule")
                .Elements ("Atom")
                .Where(a => (int?)a.Element("AmAcSeq") >= (int?)secondaryStructure.Element("AmAcidStart") && (int?)a.Element("AmAcSeq") <= (int?)secondaryStructure.Element("AmAcidEnd"))
                .Select (a => a.Element("StType")))
            {
                atomStType.SetValue((string)secondaryStructure.Element("StType"));
            }
        }
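
To try the nested-loop approach without touching the files on disk, here is a minimal self-contained sketch. It assumes trimmed, inline copies of the sample XML from the question; the class name `StTypeDemo` and the helper `ApplySecondaryStructure` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Xml.Linq;

public static class StTypeDemo
{
    // Hypothetical helper (not from the question): copies StType from each
    // SecondaryStructure onto every Atom whose AmAcSeq lies in
    // [AmAcidStart, AmAcidEnd].
    public static void ApplySecondaryStructure(XDocument protein, XDocument atom)
    {
        foreach (XElement structure in protein.Descendants("SecondaryStructure"))
        {
            // The explicit (int?) conversion yields null when the element is missing.
            int? start = (int?)structure.Element("AmAcidStart");
            int? end = (int?)structure.Element("AmAcidEnd");
            string stType = (string)structure.Element("StType");

            foreach (XElement a in atom.Element("Molecule").Elements("Atom"))
            {
                int? seq = (int?)a.Element("AmAcSeq");
                // Comparisons involving a null int? evaluate to false,
                // so atoms without an AmAcSeq are simply skipped.
                if (seq >= start && seq <= end)
                {
                    a.SetElementValue("StType", stType);
                }
            }
        }
    }

    public static void Main()
    {
        // Inline sample data, assumed for this sketch.
        XDocument protein = XDocument.Parse(
            @"<ProteinStructure>
                <SecondaryStructure>
                  <StName>HELX_P1</StName>
                  <StType>alphaHelix</StType>
                  <AmAcidStart>1</AmAcidStart>
                  <AmAcidEnd>2</AmAcidEnd>
                </SecondaryStructure>
              </ProteinStructure>");

        XDocument atom = XDocument.Parse(
            @"<Molecule>
                <Atom><AtNum>1</AtNum><AmAcSeq>2</AmAcSeq><AtType>N</AtType><StType>turn</StType></Atom>
                <Atom><AtNum>2</AtNum><AmAcSeq>2</AmAcSeq><AtType>CA</AtType><StType>turn</StType></Atom>
                <Atom><AtNum>1</AtNum><AmAcSeq>3</AmAcSeq><AtType>N</AtType><StType>turn</StType></Atom>
              </Molecule>");

        ApplySecondaryStructure(protein, atom);

        // Print AmAcSeq and the resulting StType for each atom.
        foreach (XElement a in atom.Element("Molecule").Elements("Atom"))
        {
            Console.WriteLine("{0} -> {1}", (string)a.Element("AmAcSeq"), (string)a.Element("StType"));
        }
    }
}
```

With this sample data, both atoms with `AmAcSeq` 2 end up as `alphaHelix`, while the atom with `AmAcSeq` 3 keeps `turn`, since 3 falls outside the 1 to 2 range.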
dbc