Is it possible to create a rule in Spirit X3 that parses a single character and generates a string?

I'd like to use this in the context of a parser for version numbers, where each numeric identifier can be either a single digit, or a non-zero digit followed by one or more digits:

auto const positive_digit = char_(L"123456789");
auto const digit = char_(L"0123456789");
auto const digits = x3::rule<class digits, std::wstring>{"digits"} = +digit;
auto const numeric_identifier = (positive_digit >> digits) | digit;

The problem I see is that the type numeric_identifier synthesizes is not compatible with a string (see full example here).

To solve this, I would need to create a rule that matches a digit and synthesizes a string. The only solution that I can think of is to use semantic actions, but this causes errors when the rule is used in a situation where backtracking is necessary (see full example here).

Romain Deterre

2 Answers

It's not completely clear to me what you're trying to do. If the goal is to validate the format of a string but match the input string exactly, why not use x3::raw?

E.g.

auto num_id  = x3::uint_;
auto version = x3::raw[num_id % '.'];

Now you can directly parse typical version strings into a string:

Live On Coliru

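#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <string_view>

namespace x3 = boost::spirit::x3;
using sv = std::string_view; // assumed meaning of the sv alias used below

// num_id and version are the parsers defined above
int main() {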
    for (sv input : {"0", "1", "1.4", "1.6.77.0.1234",}) {
        std::string parsed;

        std::cout << "Parsing " << std::quoted(input);
         
        auto f = begin(input), l = end(input);

        if (parse(f, l, version, parsed)) {
            std::cout << " -> " << std::quoted(parsed) << "\n";
        } else {
            std::cout << " -- FAILED\n";
        }

        if (f != l) {
            std::cout << "Remaining unparsed: " << std::quoted(sv{f, l}) << "\n";
        }
    }
}

Prints

Parsing "0" -> "0"
Parsing "1" -> "1"
Parsing "1.4" -> "1.4"
Parsing "1.6.77.0.1234" -> "1.6.77.0.1234"

To add the restriction that id numbers not begin with 0 unless they're literally zero:

auto num_id  = x3::char_('0') | x3::uint_;

Of course you can be less clever or more blunt:

auto num_id
    = !x3::lit('0') >> x3::uint_
    | x3::uint_parser<unsigned, 10, 1, 1>{};

The effect would be equivalent. I like the first one a bit better.

sehe
  • Here's my idea of how you might use a semantic action - it's simplified a little, but maybe it gives you ideas: http://coliru.stacked-crooked.com/a/682b43d6ddff3412 – sehe Aug 19 '21 at 23:33
  • Now that I spotted the one in your second Compiler Explorer sample, here's what I'd probably write: https://godbolt.org/z/zbEE1PYEv – sehe Aug 19 '21 at 23:38
  • Thanks, this is very instructive. I was indeed trying to get the input string exactly, so `x3::raw` was a clean solution – Romain Deterre Aug 20 '21 at 17:23

It's tricky. I don't know if there is a better way to match a character and get a string as the parse output, but one way to do it is to leverage the fact that the sequence operator >> will propagate sequences of characters as strings.

In your rule, since numeric_identifier is either a string of digits or a single digit, you can use the fact that the single digit will be followed by end-of-input to make a sequence that will turn it into a string:

#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>

namespace x3 = boost::spirit::x3;

auto const positive_digit = x3::char_("123456789");
auto const digit = x3::char_("0123456789");
auto const numeric_identifier = (digit >> x3::eoi) | (positive_digit >> +digit);


int main() {
    std::string test = "1";
    std::string numeric_identifier_str;
    bool success = x3::parse(test.begin(), test.end(), numeric_identifier, numeric_identifier_str);
    std::cout << (success ? "success" : "failure") << "   " << numeric_identifier_str << "\n";

    test = "4545631";
    numeric_identifier_str = "";
    success = x3::parse(test.begin(), test.end(), numeric_identifier, numeric_identifier_str);
    std::cout << (success ? "success" : "failure") << "   " << numeric_identifier_str;
}

yields

success   1
success   44545631
jwezorek