Why do I get a runtime error when parsing a string with this grammar?

template <typename Iterator, typename Skipper>
struct grammar : qi::grammar<Iterator, QVariant(), Skipper>
{
  grammar() : grammar::base_type(object)
  {
    identifier = qi::raw[qi::lexeme[qi::alpha >> *(qi::alnum | '_' | ('-' >> qi::alnum))]];

    self = (qi::raw[qi::lexeme["self"]]);
    object = (self >> '.' >> identifier)
            |(object >> '.' >> identifier); // there is no runtime error without that line
  }
};

Every other grammar I tried runs fine, but I want to parse something like this:

self.foo.bar2.baz

The runtime error is thrown by this call:

     qi::phrase_parse(it, str.end(), g, ascii::space, v) && it == str.end()

DmitryU
  • What is the declared type of the rules? In other words, can you make the sample proper (SSCCE/MVCE) – sehe Apr 15 '16 at 16:37

2 Answers

It seems to me that the object rule, being the starting point, must be declared as

qi::rule<It, QVariant(), Skipper> object;

Although I have no clue what QVariant is, I know this:

For attribute propagation to work, you need to have attribute type compatibility using the builtin Qi transformation heuristics.

For the first branch (self >> '.' >> identifier) this /could/ be simple enough. Let's assume identifier synthesizes a string-compatible attribute (e.g. std::string or std::vector<char>); then the resulting attribute can legally be assigned as a string.

The Sample

As a simple example, look at this (where I "emulate" something like what QVariant could be):

#include <boost/spirit/include/qi.hpp>
#include <boost/variant.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

using QVariant = boost::variant<std::string, int>;

template <typename Iterator, typename Skipper>
struct grammar : qi::grammar<Iterator, QVariant(), Skipper>
{
    grammar() : grammar::base_type(object)
    {
        identifier = qi::raw[qi::lexeme[qi::alpha >> *(qi::alnum | '_' | ('-' >> qi::alnum))]];

        self   = (qi::raw[qi::lexeme["self"]]);
        object = 
             qi::as_string [self >> '.' >> identifier]
            //|qi::as_string [object >> '.' >> identifier] // there is no runtime error without that line
            ;
    }
  private:
    qi::rule<Iterator, QVariant(), Skipper> object;
    qi::rule<Iterator, std::string(), Skipper> identifier;
    qi::rule<Iterator, std::string(), Skipper> self;
};

int main() {
    using It = std::string::const_iterator;
    std::string input = "self.foo.bar2.baz";

    It f = input.begin(), l = input.end();
    QVariant parsed;
    bool ok = qi::phrase_parse(f, l, grammar<It, qi::space_type>{}, qi::space, parsed);

    if (ok)
        std::cout << "Parsed: " << parsed << "\n";
    else
        std::cout << "Parse failed\n";

    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}

Printing:

Parsed: selffoo
Remaining unparsed: '.bar2.baz'

The Problem

The second branch

qi::as_string [object >> '.' >> identifier]

would have to synthesize to tuple<QVariant, std::string> to be consistent with the rest of the declarations. There is no way for Spirit to automatically transform that. The heuristic system might start grabbing at straws, and try to treat the bound attribute (remember, this is the enigmatic QVariant) as a container. If it succeeds at this¹ things would compile. Obviously, at runtime things come crashing down because the incorrect interfaces are invoked for the actual - runtime - value of the QVariant.

This is the theory.

A Solution?

Looking at the working demo, note that the '.' is excluded. This leads me to suspect that you actually do not want any complicated chained "list" of object dereferences, but instead might just want to treat the whole matched input as a raw string? In that case, the simplest solution would be to lift the raw[] a level, and perhaps use a string instead of QVariant.


¹ e.g. because QVariant interface is slightly sloppy/unsafe and exposes .begin/.end/value_type/insert members directly on the variant interface?

sehe
  • Thank you for your help. However, I suspect the problem lies elsewhere, in the parsing itself. It runs without runtime errors with this rule: identifier >> '.' >> object (in reverse order). It also runs with a prefix-minus rule (expr = '-' >> expr), but a rule like expr = expr >> '-' produces a runtime error. Maybe there is some catch with the parser type, which doesn't support such grammars. – DmitryU Apr 25 '16 at 07:50
  • You never told us what you want to achieve. If you start over (new question) just saying what you want to get, e.g. "I have `System.Console.WriteLine` and I want to get a `std::vector { "System", "Console", "WriteLine" }`" (but _differently_, because that's probably too simple), then we can help you by showing how you can use Spirit to achieve it. See http://www.perlmonks.org/?node_id=542341 – sehe Apr 25 '16 at 08:00
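Left-recursive rules such as A = (A >> a) | b cannot be used with LL (recursive-descent) parsers like Boost.Spirit: the rule invokes itself before consuming any input, so it recurses until the stack overflows. Such rules have to be transformed into an LL-friendly form:

A = b R
R = a R | e

where R is a new non-terminal and e is epsilon (the empty string).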
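
As a rough illustration of that transformation, here is a hand-written recursive-descent sketch in plain C++ (no Spirit) for the grammar from the question. I am assuming b = "self" '.' identifier and a = '.' identifier; PathParser and parse_path_ll are hypothetical names used only for this example, and whitespace skipping is left out for brevity:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Left-recursive original:  object = object '.' identifier | "self" '.' identifier
// LL-friendly form:         object = "self" '.' identifier rest     (A = b R)
//                           rest   = '.' identifier rest | epsilon  (R = a R | e)
struct PathParser {
    const std::string& input;
    std::size_t pos = 0;
    std::vector<std::string> parts;

    explicit PathParser(const std::string& s) : input(s) {}

    // identifier = alpha (alnum | '_' | '-' alnum)*
    bool parse_identifier(std::string& out) {
        std::size_t start = pos;
        if (pos >= input.size() || !std::isalpha(static_cast<unsigned char>(input[pos])))
            return false;
        ++pos;
        while (pos < input.size()) {
            unsigned char c = static_cast<unsigned char>(input[pos]);
            if (std::isalnum(c) || c == '_') {
                ++pos;
            } else if (c == '-' && pos + 1 < input.size() &&
                       std::isalnum(static_cast<unsigned char>(input[pos + 1]))) {
                pos += 2;
            } else {
                break;
            }
        }
        out = input.substr(start, pos - start);
        return true;
    }

    // a = '.' identifier; restores the position on failure (backtracking).
    bool parse_dot_identifier() {
        std::size_t save = pos;
        std::string id;
        if (pos < input.size() && input[pos] == '.') {
            ++pos;
            if (parse_identifier(id)) {
                parts.push_back(id);
                return true;
            }
        }
        pos = save;
        return false;
    }

    // R = a R | e: every recursive call has consumed input first, so it terminates.
    void parse_rest() {
        if (parse_dot_identifier())
            parse_rest();
        // otherwise epsilon: match nothing
    }

    // A = b R
    bool parse_object() {
        if (input.compare(pos, 4, "self") != 0)
            return false;
        pos += 4;
        parts.push_back("self");
        if (!parse_dot_identifier())
            return false;
        parse_rest();
        return pos == input.size();
    }
};

// Hypothetical helper: fills out with the path segments on success.
bool parse_path_ll(const std::string& text, std::vector<std::string>& out) {
    PathParser p(text);
    bool ok = p.parse_object();
    out = p.parts;
    return ok;
}
```

In Spirit itself the same idea is usually written with a Kleene star or plus instead of a separate R rule, for example self >> +('.' >> identifier).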

DmitryU