I would like to parse strings with escaping rules similar to those of C. I want to keep the escapes as they are, rather than decode them and re-encode them afterwards. I expected *(char_('\\') >> char_ | char_ - '"') to do that, but it does not: it behaves as if I had written lit('\\'), discarding the backslash.

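// test(input, name, rule) parses input with rule and prints the result; it is defined in the full example.
#define TEST(Rule) test(input, #Rule, Rule)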
int main()
{
  const auto input = std::string{"\\( \\\" \\\\ \\)"};
  TEST(lexeme[*(lit('\\') >> char_ | char_ - '"')]);
  TEST(lexeme[*(char_('\\') >> char_ | char_ - '"')]);
  TEST(lexeme[*char_]);
}

gives

\( \" \\ \): lexeme[*(lit('\\') >> char_ | char_ - '"')]: ( " \ )
\( \" \\ \): lexeme[*(char_('\\') >> char_ | char_ - '"')]: ( " \ )
\( \" \\ \): lexeme[*char_]: \( \" \\ \)

The whole example is available on Coliru.

akim
This is weird. http://coliru.stacked-crooked.com/a/3365b20994ce460e - no matter what I try with lookahead, all the backslashes are dropped. – Xeverous Jan 08 '19 at 18:07

1 Answer

Your second grammar (lexeme[*(char_('\\') >> char_ | char_ - '"')]) is correct: char_('\\') should synthesize an attribute of type char whose value is always '\\'.

This has just been confirmed as a bug in Spirit: https://github.com/boostorg/spirit/issues/434

Xeverous