I have big XML files that contain a lot of data (which I don't need right now) and a very large number of procedures. I want to read the XML from the end to get the last procedure, that is, everything from the last START PROCEDURE to the last END PROCEDURE.

I tried reading from the start, but that is not efficient: it takes too long to get through all the data, treating each procedure as the last one until the next is found.

I also tried reading the file line by line into an array and then searching from the end with a for loop, but this takes ages too.

GetElementsByTagName is not useful either, because the file has many start and end procedures and it is not very efficient if I must go through all of them.

The XML files have the following format:

    .............
    .............
    .............
    <value>
      <struct>
        <member>
          <name> procedureAction </name>
          <value> 0 </value>
        </member>
        <member>
          <name> mainType </name>
          <value> 200 </value>
        </member>
        <member>
          <name> subType </name>
          <value> 30 </value>
        </member>
        <member>
          <name> time </name>
          <value> 1890 </value>
        </member>
      </struct>
    </value>
    .................
    .................
    .................
    ..................
    <value>
      <struct>
        <member>
          <name> procedureAction </name>
          <value> 1 </value>
        </member>
        <member>
          <name> mainType </name>
          <value> 200 </value>
        </member>
        <member>
          <name> subType </name>
          <value> 30 </value>
        </member>
        <member>
          <name> time </name>
          <value> 1890 </value>
        </member>
      </struct>
    </value>
    .............
    .............
    .............

A procedureAction with value 0 is a START; a procedureAction with value 1 is an END.

How can I read the XML file from the end? The iterator does not work because it needs a fixed-size encoding.

Thank you in advance.
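To make the idea of "reading from the end" concrete, here is a rough sketch of one possible approach: scan the file backwards in fixed-size byte blocks for a marker, then read only the tail from that offset. The names `TailScanner`, `LastIndexOf`, `ReadFrom` and `blockSize` are hypothetical and not part of any library. The sketch assumes the file is ASCII or UTF-8 (so an ASCII marker can be matched byte by byte) and that the marker appears in the file exactly as written, without extra whitespace between the tags. It would not work as-is for UTF-16.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not a library API: finds a marker by scanning from the end of the file.
public static class TailScanner
{
    // Returns the byte offset of the last occurrence of marker in the file, or -1 if absent.
    // Assumption: marker is pure ASCII and the file is ASCII or UTF-8.
    public static long LastIndexOf(string path, string marker, int blockSize = 64 * 1024)
    {
        byte[] needle = Encoding.ASCII.GetBytes(marker);
        if (needle.Length == 0) throw new ArgumentException("marker must not be empty");
        if (blockSize <= 0) throw new ArgumentOutOfRangeException("blockSize");

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Each window is one block plus an overlap of needle.Length - 1 bytes,
            // so a marker that straddles a block boundary is still found.
            byte[] buffer = new byte[blockSize + needle.Length - 1];
            long end = fs.Length;
            while (end > 0)
            {
                long start = Math.Max(0, end - blockSize);
                int wanted = (int)Math.Min(fs.Length - start, (end - start) + needle.Length - 1);
                fs.Seek(start, SeekOrigin.Begin);

                // Stream.Read may return fewer bytes than requested, so loop until filled.
                int read = 0;
                while (read < wanted)
                {
                    int n = fs.Read(buffer, read, wanted - read);
                    if (n == 0) break;
                    read += n;
                }

                // Search this window from its end so the first hit is the last occurrence.
                for (int i = read - needle.Length; i >= 0; i--)
                {
                    if (MatchesAt(buffer, i, needle)) return start + i;
                }
                end = start; // move one block towards the start of the file
            }
        }
        return -1;
    }

    private static bool MatchesAt(byte[] haystack, int offset, byte[] needle)
    {
        for (int j = 0; j < needle.Length; j++)
        {
            if (haystack[offset + j] != needle[j]) return false;
        }
        return true;
    }

    // Reads everything from the given byte offset to the end of the file as UTF-8 text.
    public static string ReadFrom(string path, long offset)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);
            using (var reader = new StreamReader(fs, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

With a marker such as `"<name>procedureAction</name><value>0</value>"`, `LastIndexOf` would give the offset of the last START, and `ReadFrom` would return the text from there to the end of the file. That tail may still contain data after the last END, so it would need to be cut at the matching procedureAction 1 entry. Only the blocks near the end of the file are read, which is the point of the exercise.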

The procedureSearch function splits the XML wherever the "procedureAction" element name is found and saves each piece in a position of the array. This is done for all lines that contain procedureAction. I tried to search for the last "procedureAction value 0" in the array, but it takes too much time:

    // Requires: using System; using System.Collections.Generic; using System.Linq;
    private string[] procedureSearch(string inputXmlAsString) {

        const string startSplit = "<value><struct><member><name>procedureAction</name><value>";
        const string stopSplit = "</value></member></struct></value><value>";
        // Closing tags appended to each piece: stopSplit without its trailing "<value>"
        const string closingTags = "</value></member></struct></value>";
        List<string> allResults = new List<string>();

        // Keep going while the start marker is contained in the remaining log
        while (inputXmlAsString.Contains(startSplit)) {

            // Split at the start marker (throw away split[0], keep the rest).
            // Note: this re-splits the whole remaining text on every iteration.
            string[] split = inputXmlAsString.Split(new string[] { startSplit }, StringSplitOptions.RemoveEmptyEntries);
            // Gather the remaining pieces back into one string
            inputXmlAsString = string.Join(startSplit, split.Skip(1));
            // And split the remaining text at the stop marker
            string[] tempSplitter = inputXmlAsString.Split(new string[] { stopSplit }, StringSplitOptions.RemoveEmptyEntries);
            if (tempSplitter.Length == 0) {
                break; // nothing left after the last start marker
            }
            // Connect from the start marker to the stop marker
            allResults.Add(startSplit + tempSplitter[0] + closingTags);

        }//while start marker exists

        return allResults.ToArray();
    }//end of procedureSearch function

The main is very simple:

    string pieceOfXml = System.IO.File.ReadAllText(logPath);
    const int MAX = 100000;
    string[] allProcedures = new string[MAX];
    int allProceduresCounter = 0;
    allProcedures = procedureSearch(pieceOfXml);

The result is an array with almost 3000 lines. Here is an example of part of one line:

<value><struct><member><name>procedureAction</name><value>20</value></member> <member><value><struct><member><name>mainType</name><value>31</value></member> <member><name>subType</name><value>0</value></member></struct></value></member> <member><name>time</name><value>97</value></member></member></struct></value>

  • Please show some code. What have you tried? – afaolek Jul 16 '14 at 13:28
  • Check this [page](http://www.csharp-examples.net/xml-nodes-by-name/) for a start on what to do. – afaolek Jul 16 '14 at 13:51
  • I have used this to split the elements but this is for after taking the last procedure. The XmlNodeType works better because some procedures have more elements than other procedure.. – user3238433 Jul 16 '14 at 14:02
  • `using (XmlReader reader = XmlReader.Create(new StringReader(inputLogEntry[d]))) { XmlWriterSettings ws = new XmlWriterSettings(); ws.Indent = true; while (reader.Read()) { switch (reader.NodeType) { case XmlNodeType.Element:.... case XmlNodeType.Text:.... case XmlNodeType.XmlDeclaration:... case XmlNodeType.ProcessingInstruction:... case XmlNodeType.Comment:... case XmlNodeType.EndElement:... ` – user3238433 Jul 16 '14 at 14:04
  • This is an opportunity to be introduced to XLINQ (http://www.codeproject.com/Articles/18751/XLINQ-Introduction-Part-Of). Reading XML files as in your example is a long and tedious process. Modern .NET offers much better ways of doing what you want in an easy way. – Roman Mik Jul 16 '14 at 14:07
