Consider the following sample text line:
"Hello : World 2020 :tag1:tag2:tag3"
I want to design a Spirit X3 parser that can extract:
- Content := "Hello : World 2020 "
- Tags := { tag1,tag2,tag3 }
The problem: Content is defined as the leftover character sequence (excluding eol) that remains once the tags have been matched, and I am not sure how to write a rule that synthesizes two attributes: one for the extracted tags and another for the leftover characters (the content).
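To make the expected result concrete, here is a hand-written sketch in plain C++ (no Spirit involved) of the split I am after. The names `split_line` and `is_tag_char` are made up for illustration, it uses `std::string` instead of `std::u32string` to stay short, and it assumes the tag block is the trailing run of `:tag` items with an optional closing `:`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for ast::sample (std::string instead of std::u32string).
struct sample {
    std::string content;
    std::vector<std::string> tags;
};

// Same character set as the tag rule: alnum, '_', '@', '#', '%'.
bool is_tag_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0
        || c == '_' || c == '@' || c == '#' || c == '%';
}

// Hypothetical helper: peel ":tag" items off the end of the line;
// whatever is left in front of them is the content.
sample split_line(const std::string& line) {
    sample result;
    std::size_t pos = line.size();
    if (pos > 0 && line[pos - 1] == ':') --pos;   // optional closing ':'
    std::size_t block_start = line.size();        // no tags found yet
    while (pos > 0) {
        std::size_t first = pos;
        while (first > 0 && is_tag_char(line[first - 1])) --first;
        // A tag needs at least one character and a ':' right before it.
        if (first == pos || first == 0 || line[first - 1] != ':') break;
        result.tags.push_back(line.substr(first, pos - first));
        pos = first - 1;                          // index of the leading ':'
        block_start = pos;
    }
    std::reverse(result.tags.begin(), result.tags.end()); // collected back to front
    result.content = line.substr(0, block_start);
    return result;
}
```

For the sample line this gives the content `"Hello : World 2020 "` (trailing space included) and the tags `tag1`, `tag2`, `tag3`. The `:` after `Hello` stays in the content because it is not directly followed by a tag character. What I am looking for is the X3 equivalent of this split.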
So far I've written the rule for extracting the tags:
...
namespace ast {
    struct sample {
        std::u32string content;
        std::vector<std::u32string> tags;
    };
    //BOOST FUSION STUFF .....
}
namespace grammar {
    namespace x3 = boost::spirit::x3;
    using x3::unicode::lit;
    using x3::unicode::char_;
    using x3::unicode::alnum;
    // A tag is a ':' followed by one or more tag characters.
    auto const tag
        = x3::rule<class tag_class, std::u32string> {"tag"}
        %=
        lit(U":")
        >>
        +(alnum | lit(U"_") | lit(U"@") | lit(U"#") | lit(U"%") )
        ;
    // One or more tags, closed by a final ':'.
    auto const tags
        = x3::rule<class tags_class, std::vector<std::u32string>> {"tags"}
        %= +tag >> lit(U":");
}
But I am stuck here:
auto const sample_rule
    = x3::rule<class sample_rule_class, ast::sample> {"sample"}
    = ?? // something like (+char_ - (eol|tags))