
I'm trying to parse this simple, concise, XML-like structure with Boost.Spirit:

One{
    Two{
        Three{
        }
    }
}

And the code is organized as follows:

Struct definition to keep the spirit-stuff:

struct config;
typedef boost::variant< boost::recursive_wrapper<config> , std::string > config_node;

struct config
{
    std::string name;
    std::vector<config_node> children; 
};


BOOST_FUSION_ADAPT_STRUCT(
    config,
    (std::string, name)
    (std::vector<config_node>, children)
)

(shamelessly stolen from the XML intro)

Declaration of the rules (in the parser class):

qi::rule<Iterator, config(), qi::locals<std::string>, ascii::space_type> cfg;
qi::rule<Iterator, config_node(), ascii::space_type> node;
qi::rule<Iterator, std::string(), ascii::space_type> start_tag;
qi::rule<Iterator, void(std::string), ascii::space_type> end_tag;

Definition of the rules, in the parser 'parse' method.

    node = cfg;
    start_tag =  +(char_ -'{') >> '{';
    end_tag = char_('}');

    cfg %=  start_tag[_a = _1]
        >>  *node
        >>  end_tag(_a);

_a and _1 are Boost.Phoenix placeholders: _a is the rule's local variable (from qi::locals) and _1 is the attribute of start_tag.

These rules work for the small snippet pasted above, but if I change it to:

One{
    Two{
    }
    Three{
    }
}

(two groups in the same scope, instead of one group nested inside the other), the parser fails, and I have no idea why.

Tomaz Canabrava

1 Answer


For future reference, your code seems like a simplified version of mini_xml2.cpp from Boost's tutorial (aka "the shamelessly stolen one").

To make your example work, you'd have to change the line:

start_tag =  +(char_ -'{') >> '{';

to

start_tag = +(char_ -'{' - '}') >> '{';

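The reason: +(char_ - '{') accepts every character except '{', including '}'. Whenever the parser has parsed a start_tag, it starts looking for nodes (because of the >> *node part), and each node begins with another start_tag. In your second input, after Two{ the parser therefore reads } Three{ as a start tag named }Three (the skipper drops the whitespace), nested inside Two, and the closing braces that remain no longer balance. Excluding '}' from the tag name prevents that.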
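To see the effect without compiling Spirit, here is a minimal sketch of the same grammar as a hand-written recursive-descent parser. It is only a model, not Spirit code: Node, skip_ws, expect, parse_start_tag, parse_cfg and parses are hypothetical names, and the excluded parameter stands in for the characters subtracted from char_ in start_tag ("{" for the original rule, "{}" for the corrected one). It assumes, as in Qi, that whitespace is skipped before every primitive and that a failed sequence restores the input position.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's `config` struct (no variant).
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Models the ascii::space skipper that runs before each primitive parser.
void skip_ws(const std::string& in, std::size_t& pos) {
    while (pos < in.size() &&
           std::isspace(static_cast<unsigned char>(in[pos]))) ++pos;
}

// Models a character literal such as '{' or '}': skip, then match one char.
bool expect(const std::string& in, std::size_t& pos, char c) {
    skip_ws(in, pos);
    if (pos < in.size() && in[pos] == c) { ++pos; return true; }
    return false;
}

// Models start_tag = +(char_ - <excluded chars>) >> '{'.
bool parse_start_tag(const std::string& in, std::size_t& pos,
                     const std::string& excluded, std::string& name) {
    std::size_t p = pos;  // work on a copy, commit only on success
    std::string collected;
    for (;;) {
        skip_ws(in, p);
        if (p >= in.size() || excluded.find(in[p]) != std::string::npos) break;
        collected += in[p++];
    }
    if (collected.empty() || !expect(in, p, '{')) return false;
    pos = p;
    name = collected;
    return true;
}

// Models cfg = start_tag >> *node >> '}' (with node = cfg).
bool parse_cfg(const std::string& in, std::size_t& pos,
               const std::string& excluded, Node& out) {
    std::size_t p = pos;
    Node node;
    if (!parse_start_tag(in, p, excluded, node.name)) return false;
    for (;;) {  // Kleene star: stop at the first child that fails
        Node child;
        if (!parse_cfg(in, p, excluded, child)) break;  // p is untouched on failure
        node.children.push_back(child);
    }
    if (!expect(in, p, '}')) return false;
    pos = p;
    out = node;
    return true;
}

// True if the whole input is one cfg, ignoring trailing whitespace.
bool parses(const std::string& in, const std::string& excluded) {
    std::size_t pos = 0;
    Node root;
    if (!parse_cfg(in, pos, excluded, root)) return false;
    skip_ws(in, pos);
    return pos == in.size();
}
```

Calling parses("One{ Two{ } Three{ } }", "{") models the original rule and parses("One{ Two{ } Three{ } }", "{}") models the corrected one; only the second accepts the sibling groups, while both accept the fully nested input from the question.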


By the way, there are a few redundancies in your code that you might consider removing. For instance:

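In the original mini_xml2.cpp example, end_tag served as a function checking that you close the same tag as the one you opened (hence the signature void(std::string)). Your closing brace carries no name, so there is nothing to check and the inherited attribute goes unused. You'd be better off with: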

cfg %=  start_tag[_a = _1]
    >>  *node
    >>  "}";

The nodes in the mini_xml2.cpp example were polymorphic (either a nested element or text), so boost::variant was used along with visitors. In your example every node is a config, so the variant is redundant too. Honestly, it makes me wonder how the line

node = cfg;

didn't cause any problems during compilation, since node has a boost::variant type. FYI, in the original example this line was:

node %= xml | text;

and the %= operator correctly deduced the attribute type of the right-hand side, since the alternative operator | exposes its result as a boost::variant.

Richard Pump