2

I want to extract an individual node using POCO's libraries but can't figure out how to do it. I'm new to XML.

The XML itself looks something like this (abbreviated):

<?xml version="1.0" encoding="UTF-8"?>
<!-- Created by XMLPrettyPrinter on 11/28/2012 from  -->
<sbml xmlns = "http://www.sbml.org/sbml/level2/version4" level = "2" version = "4">
<model id = "cell">
  <listOfSpecies>
</listOfSpecies>
  <listOfParameters>
     <parameter id = "kk1" value = "1"/>
  </listOfParameters>
  <listOfReactions>
     <reaction id = "J1" reversible = "false">
... much stuff here ..
  </listOfReactions>
</model>
</sbml>

I want to extract everything in the listOfReactions node and store it in a std::string, for later MD5 hashing.

I have tried this:

ifstream in(JoinPath(gTestDataFolder, "Test_1.xml").c_str());
InputSource src(in);
DOMParser parser;
AutoPtr<Document> pDoc = parser.parse(&src);
NodeIterator it(pDoc, Poco::XML::NodeFilter::SHOW_ALL);
Node* pNode = it.nextNode();

while(pNode)
{
    clog<<pNode->nodeName()<<endl;
    string elementID = "listOfReactions";
    if(pNode->nodeName() == "listOfReactions")
    {
         //Extract everything in this node... how???
    }

    pNode = it.nextNode();
}
mhucka
Totte Karlsson
  • So, what have you tried? – Alon Mar 25 '13 at 22:30
  • I have tried their example, involving a DOM parser. I am able to get to the node, but don't know how to extract its content. I can print pNode->getNodeValue(), but that prints nothing. – Totte Karlsson Mar 25 '13 at 22:34
  • The node "listOfReactions" does not hold the value itself. It has a child node named "#text" that holds the value. It is not obvious, but calling "pNode->firstChild()->nodeValue()" will give you the value. – N. Nowak Jan 20 '17 at 15:09

4 Answers

6

I ran into a similar problem myself. In your case, with the "Poco::XML::NodeFilter::SHOW_ALL" filter applied, all node types (Element, Text, CDataSection, etc.) are included when iterating through the XML document. I found that a node returned by "nextNode()" does not carry all of its data directly: attributes and text content have to be fetched separately.

If one wants to access an XML node's attributes, one first has to check whether the node has any attributes using "hasAttributes()" and, if it does, iterate through them to find the ones of interest.

XML Example:

<?xml version="1.0"?>
<reaction id="J1" reversible="false">

C++ Example:

...
Poco::XML::NamedNodeMap* attributes = NULL;
Poco::XML::Node* attribute = NULL;

while(pNode)
{
  if( (pNode->nodeName() == "reaction") && pNode->hasAttributes())
  {
    attributes = pNode->attributes();
    for(unsigned int i = 0; i < attributes->length(); i++)
    {
      attribute = attributes->item(i);
      cout << attribute->nodeName() << " : " << attribute->nodeValue() << endl;
    }
    attributes->release(); // the map returned by attributes() must be released
  }
  pNode = it.nextNode();
}
...

Should output:

id : J1
reversible : false

If one wants to access the text between two XML tags, as shown in the XML example below, one first has to find the node whose name matches the tag of interest, as you have done in your example, and then fetch the next node by calling "nextNode()" and check whether its node name is "#text" or "#cdata-section". If it is, the value of this next node contains the text between the XML tags.

XML Example:

<?xml version="1.0"?>
<listOfReactions>Some text</listOfReactions>

C++ Example:

...
while(pNode)
{
  if(pNode->nodeName() == "listOfReactions")
  {
    pNode = it.nextNode();
    if(!pNode)
    {
      break; //End of document reached
    }
    if(pNode->nodeName() != "#text")
    {
      continue; //No text node present
    }
    cout << "Tag Text: " << pNode->nodeValue() << endl;
  }
  pNode = it.nextNode();
}
...

Should output:

Tag Text: Some text
Poul
0

Try these slides, and the Poco documentation for the API reference.

There is also a nice tutorial here which has a simple to understand example of what you're trying to do.

JBentley
0

Late to the game, but maybe still useful. I am looking into Poco XML to extract weather data delivered as XML. I found the PDF slides @JBentley mentions to be a good introduction. This provides the hpp-file. This example covers the implementation. I omitted LexicalHandler.

I look for the element listOfReactions, and once it is found I add each attribute name and value to a string in startElement(). In characters() I append the node's text to that string and push the result onto a vector, which can then be traversed.

Output:

id=J1,reversible=false,false move
id=J2,reversible=true,true move

I changed your XML slightly for testing, and escaped the double quotes so it can be used as a string literal in the program.

<?xml version=\"1.0\" encoding=\"UTF-8\"?><sbml xmlns = \"http://www.sbml.org/sbml/level2/version4\" level = \"2\" version = \"4\">
    <model id = \"cell\">
        <listOfSpecies>species</listOfSpecies>
        <listOfParameters>
            <parameter id = \"kk1\" value = \"1\"/>
        </listOfParameters>
        <listOfReactions>
            <reaction id = \"J1\" reversible = \"false\">false move</reaction>
            <reaction id = \"J2\" reversible = \"true\">true move</reaction>
        </listOfReactions>
    </model>
</sbml>

main:

#include <iostream>

#include "MyHandler.hpp"

using namespace std;

int main() {
    std::string s = "...";  // placeholder: paste the escaped XML string from above
    MyHandler handler {};
    Poco::XML::SAXParser parser {};
    parser.setFeature(Poco::XML::XMLReader::FEATURE_NAMESPACES, false);
    parser.setFeature(Poco::XML::XMLReader::FEATURE_NAMESPACE_PREFIXES, true);
    parser.setContentHandler(&handler);

    try {
        parser.parseString(s);
    } catch (Poco::Exception& e) {
        cerr << e.displayText() << endl;
    }
    auto saved = handler.saved_reactions();
    for (auto& i : saved) {
        cout << i << endl;
    }
    return 0;
}

MyHandler.hpp:

#ifndef MYHANDLER_HPP_
#define MYHANDLER_HPP_

#include <iostream>
#include <string>
#include <vector>
#include <Poco/SAX/Attributes.h>
#include <Poco/SAX/ContentHandler.h>
#include <Poco/SAX/SAXParser.h>

class MyHandler: public Poco::XML::ContentHandler {
public:
    MyHandler();
    virtual ~MyHandler();

    // ContentHandler overrides, begin.
    void setDocumentLocator(const Poco::XML::Locator* loc);
    void startDocument();
    void endDocument();
    void startElement(
            const Poco::XML::XMLString&,
            const Poco::XML::XMLString&,
            const Poco::XML::XMLString&,
            const Poco::XML::Attributes&);
    void endElement(
            const Poco::XML::XMLString&,
            const Poco::XML::XMLString&,
            const Poco::XML::XMLString&);
    void characters(const Poco::XML::XMLChar ch[], int, int);
    void ignorableWhitespace(const Poco::XML::XMLChar ch[], int, int);
    void processingInstruction(const Poco::XML::XMLString&, const Poco::XML::XMLString&);
    void startPrefixMapping(const Poco::XML::XMLString&, const Poco::XML::XMLString&);
    void endPrefixMapping(const Poco::XML::XMLString&);
    void skippedEntity(const Poco::XML::XMLString&);
    // ContentHandler overrides, end
    std::vector<std::string> saved_reactions();

private:
    bool show = false;
    std::string reactions_s {};
    std::vector<std::string> reactions_v {};
};

#endif /* MYHANDLER_HPP_ */

MyHandler.cpp:

#include "MyHandler.hpp"

MyHandler::MyHandler() {}
MyHandler::~MyHandler() {}

void MyHandler::setDocumentLocator(const Poco::XML::Locator* loc) {
}

void MyHandler::startDocument() {
}

void MyHandler::endDocument() {
}

void MyHandler::startElement(const Poco::XML::XMLString& namespaceURI, const Poco::XML::XMLString& localName, const Poco::XML::XMLString& qname, const Poco::XML::Attributes& attributes) {
    int x {0};
    std::cout << "qname: " << qname << std::endl;
    /*    std::cout << "getValue(): " << attributes.getValue(qname) << std::endl;
    std::cout << "getLength(): " << attributes.getLength() << std::endl;*/
    if (qname == "listOfReactions") {
        show = true;
    }
    if (show) {
        if (attributes.getLength()) {
            reactions_s.clear();
            x = attributes.getLength();
            for (int i = 0; i < x; ++i) {
                std::cout << "getQName(): " << attributes.getQName(i) << ", getValue(): " << attributes.getValue(i) << std::endl;
                if (reactions_s.size()) reactions_s += ",";
                reactions_s += attributes.getQName(i) + "=" + attributes.getValue(i);
            }
        }
    }
}

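void MyHandler::endElement(const Poco::XML::XMLString& namespaceURI,
        const Poco::XML::XMLString& localName,
        const Poco::XML::XMLString& qname) {
    // Stop collecting once the listOfReactions element closes.
    if (qname == "listOfReactions") {
        show = false;
    }
}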

void MyHandler::characters(const Poco::XML::XMLChar ch[], int start, int length) {
    std::cout << std::string(ch + start, length) << std::endl;
    if (show) {
        reactions_s += "," + std::string(ch + start, length);
        reactions_v.emplace_back(reactions_s);
    }
}

void MyHandler::ignorableWhitespace(const Poco::XML::XMLChar ch[], int start, int length) {
}

void MyHandler::processingInstruction(const Poco::XML::XMLString& allocator, const Poco::XML::XMLString& allocator1) {
}

void MyHandler::startPrefixMapping(const Poco::XML::XMLString& allocator, const Poco::XML::XMLString& allocator1) {
}

void MyHandler::endPrefixMapping(const Poco::XML::XMLString& allocator) {
}

std::vector<std::string> MyHandler::saved_reactions() {
    return reactions_v;
}

void MyHandler::skippedEntity(const Poco::XML::XMLString& allocator) {
}
kometen
-1

Assuming XML below in file "hello.xml"

<root>
    <headers>
        <header>Hello</header>
        <header>World</header>
    </headers>
</root>

One could parse this as follows:

#include <iostream>
#include <string>
#include <sstream>
#include <Poco/Exception.h>
#include <Poco/AutoPtr.h>
#include <Poco/Util/XMLConfiguration.h>

using namespace std;
using namespace Poco;
using namespace Poco::Util;

int main(int argc, char*argv[]) {

    int counter = 0;
    AutoPtr<XMLConfiguration> apXmlConf(new XMLConfiguration("hello.xml"));
    try {
        while(1) { // Loop breaks by Poco exception
            stringstream tag;
            tag << "headers.header[" << counter++ << "]";
            string header = apXmlConf->getString(tag.str());
            cout << header << " ";
        }
    } catch(NotFoundException& e) { (void)e; }
    cout << endl;
    return 0;
}

Hope that helps.

AjK