In a function, I use pugixml to load an XML file. I then traverse the child nodes of the tree and push some of them (objects of type xml_node) into a vector of xml_node. But as soon as the function returns, the xml_document object loaded from the file is destroyed, which invalidates the elements of the vector.

Below is sample code (written quickly) to show this:

#include "pugixml.hpp"
#include <vector>

void ProcessXmlDeferred(  std::vector<pugi::xml_node> const &subTrees )
{
   for( auto const & node : subTrees )
   {
       // parse each xml_node node
   }
}

void IntermedProcXml( pugi::xml_node const &node)
{
   // parse node
}

std::vector<pugi::xml_node> BuildSubTrees(pugi::xml_node const & node )
{
  std::vector<pugi::xml_node> subTrees;

  pugi::xml_node temp = node.child("L1");
  subTrees.push_back( temp );

  temp = node.child("L1").child("L2");
  subTrees.push_back( temp );

  temp = node.child("L1").child("L2").child("L3");
  subTrees.push_back( temp );

  return subTrees;
}

void LoadAndProcessDoc( const char* fileNameWithPath, std::vector<pugi::xml_node> & subTrees )
{
    pugi::xml_document doc;
    pugi::xml_parse_result result = doc.load_file( fileNameWithPath );

    subTrees = BuildSubTrees( doc.child("TOP") );
    IntermedProcXml( doc.child("CENTRE") );

    // The local pugi object "doc" is destroyed at exit of this function,
    // invalidating the xml nodes stored inside vector "subTrees"
}


int main()
{
    char fileName[] = "myFile.xml";
    std::vector<pugi::xml_node> myXmlSubTrees;  

    // Load XML file and return vector of XML subtrees for later parsing
    LoadAndProcessDoc( fileName, myXmlSubTrees );

    // At this point, the xml nodes inside the vector
    // "myXmlSubTrees" are no longer valid and thus are unsafe to use

    // ....
    // Lots of intermediate code
    // ....

    // This function tries to process vector whose xml nodes 
    // are invalid and thus throws errors
    ProcessXmlDeferred( myXmlSubTrees );

    return 0;
}

I therefore need a way to save, copy, clone, or move subtrees (xml nodes) of my original XML tree so that I can safely parse them later, even after the original XML document object is destroyed. How can I do this in pugixml?

nurabha
  • Don't delete the "original XML root tree object". Keep it alive, so that all its subtrees remain alive. – arayq2 Jul 22 '15 at 13:46

1 Answer

Just pass the ownership of the xml_document object to the caller.

You can do this either by forcing the caller to supply an xml_document object (add an xml_document& argument to the function), or by returning a shared_ptr<xml_document> or unique_ptr<xml_document> from the function alongside the node vector.

zeuxcg
  • Thanks for the answer. Actually, I cannot pass ownership of the entire XML tree to the caller, because another function needs to read parts of it. I need to somehow move, clone, or deep-copy only the subtrees I want to process later. What is an efficient way to do this? – nurabha Jul 21 '15 at 12:31
  • I updated the question with code which explains what I am trying to do. – nurabha Jul 21 '15 at 14:38
  • Why can't you pass the ownership? You would change the function signature to ```void LoadAndProcessDoc( const char* fileNameWithPath, pugi::xml_document& doc, std::vector<pugi::xml_node> & subTrees )``` and create the document in main(). – zeuxcg Jul 23 '15 at 03:29
  • In pugixml the memory management model is set up so that every ```pugi::xml_node``` belongs to a document; you cannot have nodes that do not belong to any document. – zeuxcg Jul 23 '15 at 03:29
  • Thanks for explaining. Yesterday I went through the documentation from scratch and read about your point that xml nodes cannot exist in isolation without an xml document. I just created an empty xml document object and appended a copy of each relevant xml node to it. This solves the problem. It doesn't seem very efficient, but it works. If you agree, I could update my question with this method. If there is a better method, please let me know. – nurabha Jul 23 '15 at 08:27
  • Actually, the code in my question is just a representation of a huge piece of code I am dealing with. I cannot change the function signatures because I would then have to make a lot of code changes everywhere else. – nurabha Jul 23 '15 at 10:57
  • Yeah, in this case you should probably use a global document as the node storage. The downside is that you'll have to remember to remove the nodes, and it's less efficient since you'll have to copy nodes from one document to the other. Still, this is a reasonable workaround. – zeuxcg Jul 24 '15 at 06:13