XML:
<node>
Lorem ipsum
<child-node>dolor</child-node>
TEXT TO BE ACCESSED
</node>
<node>
sed do eiusmod tempor etc.
</node>
This is read into a rapidxml::xml_document<> and parsed with the flag rapidxml::parse_validate_closing_tags, as follows: doc.parse<rapidxml::parse_validate_closing_tags>(). (I would have thought that this flag would solve the issue, but that does not appear to be the case.)
RapidXML C++ code looping through all <node>s of doc:
for (const rapidxml::xml_node<> *node = doc.first_node("node"); node != nullptr; node = node->next_sibling()) {
    std::cout << node->value();
}
node->value() returns Lorem ipsum during the first iteration of the loop.
While the text within the <child-node> (dolor) is accessible by creating a new *node_2 = node->first_node("child-node") (within the loop) and then accessing the value with node_2->value(), the text that follows the <child-node> (TEXT TO BE ACCESSED) is not accessible in a similar way. The documentation does not offer much in terms of advice. How might this be done with RapidXML?
The XML is intended to encode an edition of a text (following, e.g., the Perseus Digital Library), so the format used above is useful for marking specific words within sentences.