3

I came across a strange behaviour of boost spirit x3, after I splittet my grammar up into the recommended parser.hpp, parser_def.hpp, parser.cpp files. My example gramar parses some kind of easy enums:

enum = "enum" > identifier > "{" > identifier % "," > "}

this is my enum grammar. When I don't split the enum and identifier parser into the recommended files, everything works fine, especially the string "enum {foo, bar}" throws an expectation failure, as expected. This example can be found here: unsplitted working example

But when I split the exactly same grammar up into the different files, the parser throws

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

trying to parse the same string "enum {foo, bar}"

this example can be found here: splitted strange example

  1. ast.hpp

    #pragma once
    
    #include <vector>
    #include <string>
    #include <boost/fusion/include/adapt_struct.hpp>
    
    
    
    namespace ast{
    
    namespace x3 = boost::spirit::x3;
    
    struct Enum {
        std::string _name;
        std::vector<std::string> _elements;
    };
    
    
    }
    
    BOOST_FUSION_ADAPT_STRUCT(ast::Enum, _name, _elements)
    
  2. config.hpp

    #pragma once 
    
    #include <boost/spirit/home/x3.hpp>
    
    namespace parser{
    
        namespace x3 = boost::spirit::x3;
    
        typedef std::string::const_iterator iterator_type;
        typedef x3::phrase_parse_context<x3::ascii::space_type>::type context_type;
    
    }
    
  3. enum.cpp

    #include "enum_def.hpp"
    #include "config.hpp"
    
    namespace parser { namespace impl {
         BOOST_SPIRIT_INSTANTIATE(enum_type, iterator_type, context_type)
    }}
    
    namespace parser {
    
    const impl::enum_type& enum_parser()
    {
        return impl::enum_parser;
    }
    
    }
    
  4. enum_def.hpp

    #pragma once
    
    #include "identifier.hpp"
    #include "enum.hpp"
    #include "ast.hpp"
    
    namespace parser{ namespace impl{
    
        namespace x3=boost::spirit::x3;
    
        const enum_type enum_parser = "enum";
    
        namespace{
            const auto& identifier = parser::identifier();
        }
        auto const enum_parser_def =
            "enum"
            > identifier
            > "{"
            > identifier % ","
            >"}";
    
        BOOST_SPIRIT_DEFINE(enum_parser)
    }}
    
  5. enum.hpp

    #pragma once
    
    #include <boost/spirit/home/x3.hpp>
    #include "ast.hpp"
    
    namespace parser{ namespace impl{
        namespace x3=boost::spirit::x3;
    
        typedef x3::rule<class enum_class, ast::Enum> enum_type;
    
        BOOST_SPIRIT_DECLARE(enum_type)
    
    }}
    
    namespace parser{
        const impl::enum_type& enum_parser();
    }
    
  6. identifier.cpp

    #include "identifier_def.hpp"
    #include "config.hpp"
    
    namespace parser { namespace impl {
         BOOST_SPIRIT_INSTANTIATE(identifier_type, iterator_type, context_type)
    }}
    
    namespace parser {
    
    const impl::identifier_type& identifier()
    {
        return impl::identifier;
    }
    
    }
    
  7. identifier_def.hpp

    #pragma once
    #include <boost/spirit/home/x3.hpp>
    #include "identifier.hpp"
    
    namespace parser{ namespace impl{
    
        namespace x3=boost::spirit::x3;
    
        const identifier_type identifier = "identifier";    
    
        auto const identifier_def = x3::lexeme[
            ((x3::alpha | '_') >> *(x3::alnum | '_'))
        ];
    
        BOOST_SPIRIT_DEFINE(identifier)
    }}
    
  8. identifier.hpp

    #pragma once
    #include <boost/spirit/home/x3.hpp>
    
    namespace parser{ namespace impl{
        namespace x3=boost::spirit::x3;
    
        typedef x3::rule<class identifier_class, std::string> identifier_type;
    
        BOOST_SPIRIT_DECLARE(identifier_type)
    }}
    
    
    namespace parser{
        const impl::identifier_type& identifier();
    }
    
  9. main.cpp

    #include <boost/spirit/home/x3.hpp>
    #include "ast.hpp"
    #include "enum.hpp"
    
    namespace x3 = boost::spirit::x3;
    
    template<typename Parser, typename Attribute>
    bool test(const std::string& str, Parser&& p, Attribute&& attr)
    {
        using iterator_type = std::string::const_iterator;
        iterator_type in = str.begin();
        iterator_type end = str.end();
    
        bool ret = x3::phrase_parse(in, end, p, x3::ascii::space, attr);
        ret &= (in == end);
        return ret;
    
    }
    
    int main(){
        ast::Enum attr;
        test("enum foo{foo,bar}", parser::enum_parser(), attr);
        test("enum {foo,bar}", parser::enum_parser(), attr);    
    }
    

Is this a bug, am I missing something, or is this an expected behaviour?

EDIT: here is my repo with an example which throws an std::logic_error instead of an expectation_failure

Exagon
  • 4,798
  • 6
  • 25
  • 53
  • I can't reproduce it. Not looked at the code but works for me. GCC/clang with boost 1.63. By the way, please package it up more conveniently? – sehe Jan 21 '17 at 15:06
  • The code in the wandbox doesnt reproduce the behaviour? I think the problem happens when the identifier parser ist splittet up – Exagon Jan 21 '17 at 15:11
  • I said *I* cannot reproduce it. With that code. In case you prefer: https://gist.github.com/sehe/a7d536e5987ad07bb87496348eead2d1 – sehe Jan 21 '17 at 15:14
  • You dont need the enum_parser.hpp and the id.hpp to reproduce the logic error, but I will have a look in 5mins. – Exagon Jan 21 '17 at 15:22
  • I am not able to compile your example, I will set up a git repo with an example – Exagon Jan 21 '17 at 15:36
  • You couldn't be more specific than "not able" by any chance? – sehe Jan 21 '17 at 15:54
  • [here](https://github.com/bitwalda/SpiritX3StrangeSplitted) is my repo which contains an example which throws an `std::logic_error` – Exagon Jan 21 '17 at 16:48

2 Answers2

4

I've found the cause of the bug.

The bug is with the fact that the expect directive takes it subject parser by value, which is before the parser::impl::identifier initializer runs.

To visualize, imagine the static initializer for parser::impl::enum_parser running before parser::impl::identifier. This is valid for a compiler to do.

The copy, therefore, has an uninitialized name field, which fails as soon as the expectation point tries to construct the x3::expectation_failure with the which_ member, because constructing a std::string from a nullptr is illegal.

All in all, I fear the root cause here is Static Initialization Order Fiasco. I'll see whether I can fix it and submit a PR.

WORKAROUND:

An immediate workaround is to list the order of the source files in reverse, so that use comes after definition:

set(SOURCE_FILES 
    identifier.cpp
    enum.cpp 
    main.cpp 
)

Note that if this fixes it on your compiler (it does on mine) that is implementation defined. The standard does NOT specify the order of static initialization across compilation units.

sehe
  • 374,641
  • 47
  • 450
  • 633
  • Added a workaround that works on my machine (with clang and gcc) – sehe Jan 21 '17 at 22:45
  • 1
    Yeah. This was a tricky one. As you can see, it needed to replicate specific build steps in order to be reproducible :) – sehe Jan 21 '17 at 22:47
  • In my project I sometimes came across this std::logic_error, but was never able so isolate the bug until now... Probably because the changing of the source order fixes it on gcc and clang. I am happy it worked this time – Exagon Jan 21 '17 at 22:51
  • And here is the Pull Request: https://github.com/boostorg/spirit/pull/229 It doesn't magically "fix" the issue, but will help noticing the situation in debug builds. – sehe Jan 22 '17 at 00:03
  • So there isnt anything except changing the source order to avoid this? No solution which is conform with the standard? – Exagon Jan 22 '17 at 11:51
  • You can visit the link on "fiasco" to find ideas. All approaches are clumsy and require all samples to be fixed for it. Ultimo it still comes down to the library user doing the right thing. – sehe Jan 22 '17 at 11:54
  • I think the best we can hope to achieve is an extended version of the registration macros, that hides the pitfall. (But I'm already not a fan of the macros) – sehe Jan 22 '17 at 11:56
  • Fyi I am facing this on linux `g++ (GCC) 8.3.0`: `/usr/include/boost/spirit/home/x3/nonterminal/rule.hpp:163: std::__cxx11::string boost::spirit::x3::get_info >::type>::operator()(const T&) const [with T = boost::spirit::x3::rule; typename boost::enable_if >::type = void; std::__cxx11::string = std::__cxx11::basic_string]: Assertion '(r.name)&&("uninitialized rule")' failed. Aborted (core dumped)` Works fine on my mac. The workaround doesn't help – Matt Jun 06 '19 at 19:47
  • @Matt what version of boost? – sehe Jun 07 '19 at 10:36
  • @sehe Boost 1.70.0. You may have noticed how I was able to work around it in the answer below. Btw I really like spirit x3 and appreciate your supporting it. – Matt Jun 07 '19 at 15:51
2

Here's a solution that worked for me, when the above workround does not.

Say you have files a.cpp, a.h, a_def.hpp, b.cpp, b.h, b_def.hpp, ... as recommended by the Boost.Spirit X3 docs.

The basic idea is to combine the *.cpp and *_def.hpp files into 1 file for each group. The *.h files can and should remain.

  1. ls *_def.hpp > parser_def.hpp (assuming parser_def.hpp doesn't already exist) and edit parser_def.hpp to #include the files in the correct order. Remove redundant lines (add a header guard, etc.) The goal is for parser_def.hpp to include the other files in the correct order.
  2. cat *.cpp > parser.cpp and edit parser.cpp to be syntactically correct, and replace all #include <*_def.hpp> lines with a single #include <parser_def.hpp> at the top. In your build file (e.g. make or cmake), replace the compilation of the *.cpp files with the single parser.cpp.
  3. You can get rid of the old *.cpp files.

You lose the convenience of separately compiled files, but it will avoid the Static Initialization Order Fiasco.

Matt
  • 20,108
  • 1
  • 57
  • 70