1

We're using boost/rapidxml for XML parsing in a pub/sub broker implementation and for a certain XML payload, the broker crashes.

To eliminate as many variables as possible, I have implemented a short test program doing only the XML parse, and the crash is similar.

My short test program is found here:

#include <stdio.h>       // printf
#include <unistd.h>      // read
#include <sys/types.h>   // open
#include <sys/stat.h>    // open
#include <fcntl.h>       // open
#include <errno.h>       // errno
#include <string.h>      // strerror
#include "rapidxml.hpp"
#ifdef LONG_RAPIDXML_NAME_SPACE
// boost version >= 1.45
using namespace boost::property_tree::detail::rapidxml;
#else
// boost version <= 1.44
using namespace rapidxml;
#endif
/* ****************************************************************************
*
* xmlTreePresent -
*/
static void xmlTreePresent(xml_node<>* nodeP, std::string indent, int depth = 0)
{
    static int callNo = 0;
    ++callNo;
    if(nodeP == NULL)
    {
        printf("%sNULL NODE\n", indent.c_str());
    }
    char* name  = nodeP->name();
    char* value = nodeP->value();
    printf("%s%s (%s) (call %d, depth %d)\n", indent.c_str(), name, value, callNo, depth);
    xml_node<>* child = nodeP->first_node();
    while(child != NULL)
    {
        printf("%schild at         %p\n", indent.c_str(), child);
        printf("%schild->name() at %p\n", indent.c_str(), child->name());
        if((child->name() != NULL) && (child->name()[0] != 0))
        {
            xmlTreePresent(child, indent + "  ", depth + 1);
        }
        child = child->next_sibling();
    }
}
/* ****************************************************************************
*
* xmlDocPrepare -
*/
static xml_node<>* xmlDocPrepare(char* xml)
{
    xml_document<> doc;
    try
    {
        doc.parse<0>(xml);
    }
    catch(parse_error& e)
    {
        printf("PARSE ERROR: %s\n", e.what());
        return NULL;
    }
    catch(...)
    {
        printf("GENERIC ERROR during doc.parse\n");
        return NULL;
    }
    xml_node<>* father = doc.first_node();
    return father;
}
/* ****************************************************************************
*
* main -
*/
int main(int argC, char* argV[])
{
    char* fileName = argV[1];
    int   fd;
    if((fd = open(fileName, O_RDONLY)) == -1)
    {
        printf("open('%s'): %s", fileName, strerror(errno));
        exit(1);
    }
    struct stat  statBuf;
    if(stat(fileName, &statBuf) != 0)
    {
        printf("stat('%s'): %s", fileName, strerror(errno));
        exit(2);
    }
    char* buf = (char*) calloc(1, statBuf.st_size + 1);
    if(buf == NULL)
    {
        printf("calloc(%lu): %s", statBuf.st_size + 1, strerror(errno));
        exit(3);
    }
    int nb = read(fd, buf, statBuf.st_size);
    if(nb == -1)
    {
        printf("read('%s'): %s", fileName, strerror(errno));
        exit(4);
    }
    else if(nb != statBuf.st_size)
    {
        printf("read %d characters, wanted %lu", nb, statBuf.st_size);
        exit(5);
    }
    xml_node<>*   father    = xmlDocPrepare((char*) buf);
    xmlTreePresent(father, "");
    return 0;
}

A sample input XML is here: http://pastebin.com/rYiDjP7E

I have tried both in Ubuntu (libboost_serialization.so.1.49.0) and in CentOS (libboost_serialization.so.5) and I get similar crashes.

[ I'm not sure but I think boost property tree resides in the serialization library ... ]

It gets a little further in CentOS but ends up in a similar crash.

More than thankful for help here, we have users waiting for a fix ...

sehe
  • 374,641
  • 47
  • 450
  • 633
kzangeli
  • 403
  • 2
  • 8

1 Answers1

6

First off.

  1. >OMG< You have got quite a mess of code there - sorry to put it bluntly, perhaps.
  2. This isn't c++! Replace your entire main with

    int main(int argc, char* argv[])
    {
        std::ifstream ifs(argc>1? argv[1] : "input.txt");
        std::string buf(std::istreambuf_iterator<char>(ifs), {});
    
        xml_node<>* father = xmlDocPrepare(&buf[0]);
        xmlTreePresent(father, "");
    }
    
  3. rapidxml is not a part of Boost. It's used (implementation detail) in Boost Property Tree

  4. Boost Property Tree is it's own library, not related to Boost Serialization (this means you don't have to link to the shared library either)

The real bug is here:

static xml_node<>* xmlDocPrepare(char* xml)
{
    xml_document<> doc;

    ..
    xml_node<>* father = doc.first_node();
    return father;
}

This returns a reference to (into) a local, but the local (doc) doesn't exist beyond the return statement! I'd fix this instance as follows:

int main(int argc, char* argv[])
{
    std::ifstream ifs(argc>1? argv[1] : "input.txt");
    std::string xml(std::istreambuf_iterator<char>(ifs), {});

    try
    {
        xml_document<> doc;
        doc.parse<0>(&xml[0]);
        xmlTreePresent(doc.first_node());

        return 0;
    }
    catch(parse_error& e) { printf("PARSE ERROR: %s\n", e.what()); }
    catch(...)            { printf("GENERIC ERROR during doc.parse\n"); }
    return 1;
}

Note how I reduced 76 lines of code into 17 :) And leaking less memory.

Update Here's a completely cleaned up version in - HURRAY - c++ rather than C: see it Live On Coliru, where it is seen parsing the sample XML, but with the ignorable whitespace removed so you can see the output better.

It works with the 'pretty' xml as well, of course.

#include <fstream>
#include <iostream>
#include <iomanip>
#include <boost/property_tree/detail/rapidxml.hpp>

using namespace boost::property_tree::detail::rapidxml;

static void xmlTreePresent(xml_node<> const* nodeP, int depth = 0)
{
    static int callNo = 0;
    callNo += 1;
    if(nodeP)
    {
        std::cout << std::setw(depth*2) << "" << nodeP->name()/* << " (" << nodeP->value() << ") (call " << callNo << ", depth " << depth << ")" */ << "\n";
        for (xml_node<>* child = nodeP->first_node(); child; child = child->next_sibling())
        {
            auto name = child->name();
            if(name && name[0])
            {
                xmlTreePresent(child, depth + 1);
            }
        }
    } else
    {
        std::cout << std::setw(depth*2) << "" << "NULL NODE\n";
    }
}

int main(int argc, char* argv[])
{
    std::ifstream ifs(argc>1? argv[1] : "input.txt");
    std::string xml(std::istreambuf_iterator<char>(ifs), {});

    try
    {
        xml_document<> doc;
        doc.parse<0>(&xml[0]);
        xmlTreePresent(doc.first_node());

        return 0;
    }
    catch(parse_error& e) { printf("PARSE ERROR: %s\n", e.what()); }
    catch(...)            { printf("GENERIC ERROR during doc.parse\n"); }
    return 1;
}

Here's the output with the debug info removed (as shown in the /comment/):

updateContextRequest
contextElementList
    contextElement
    entityId
        id
    contextAttributeList
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
    contextElement
    entityId
        id
    contextAttributeList
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
    contextElement
    entityId
        id
    contextAttributeList
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
    contextElement
    entityId
        id
    contextAttributeList
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
    contextElement
    entityId
        id
    contextAttributeList
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
        contextAttribute
        name
        type
        contextValue
        metadata
            contextMetadata
            name
            type
            value
updateAction
sehe
  • 374,641
  • 47
  • 450
  • 633
  • 1
    Update, I've cleaned up the whole program now, and [it's now ~40 lines of code, **Live On Coliru**](http://coliru.stacked-crooked.com/a/8ca3cbe198091941). _(I hope this helped, and will drive a lesson home: write C++, not C. It's more productive, and way less error prone.)_ – sehe Apr 28 '14 at 20:01
  • In fact I just reduced it to **[~25 lines](http://coliru.stacked-crooked.com/a/3ef9372a7aaf0045)**. That would not pass code-review with me, but it does show that it's essentially just 3 functions of ~5 lines. – sehe Apr 28 '14 at 22:25
  • Thank you sehe for your quick reply. I moved the xmlDocPrepare part (especially the xml_document<> doc variable) into the parsing routine and everything works just great. About "C" or "C++", well that is a matter of opinion. I myself prefer "C". Thank you again! – kzangeli Apr 29 '14 at 07:47
  • @kzangeli Ok, fair enough. We can disagree, but yeah, taste comes into it too (or someone would be correcting me saying "use Python", "use Haskell" etc. :)) – sehe Apr 29 '14 at 09:18