
I tried to group together two cases to reduce code duplication in the grammar:

From

string = _{("'" ~ (string_value) ~ "'") | ("\"" ~ (string_value) ~ "\"") | ("\"" ~ (string_value_escape_1) ~ "\"") | ("'" ~ (string_value_escape_2) ~ "'")}
    string_escape = @{"\\"}
    string_value = @{(!("\""|"\\"|"'") ~ ANY)*}
    string_value_escape_1 = @{((!("\""|"\\") ~ ANY)+ | string_escape ~ ("\"" | "\\"))*}
    string_value_escape_2 = @{((!("'"|"\\") ~ ANY)+ | string_escape ~ ("'" | "\\"))*}

to

string = _{("'" ~ (string_value|string_value_escape_2) ~ "'") | ("\"" ~ (string_value|string_value_escape_1) ~ "\"")}
    string_escape = @{"\\"}
    string_value = @{(!("\""|"\\"|"'") ~ ANY)*}
    string_value_escape_1 = @{((!("\""|"\\") ~ ANY)+ | string_escape ~ ("\"" | "\\"))*}
    string_value_escape_2 = @{((!("'"|"\\") ~ ANY)+ | string_escape ~ ("'" | "\\"))*}

But that caused a build error in what I was sure was a simple grouping:

   = help: message: grammar error
           
            --> 3:20
             |
           3 | string = _{("'" ~ (string_value|string_value_escape_2) ~ "'") | ("\"" ~ (string_value|string_value_escape_1) ~ "\"")}
             |                    ^----------^
             |
             = expression cannot fail; following choices cannot be reached
           
            --> 3:74
             |
           3 | string = _{("'" ~ (string_value|string_value_escape_2) ~ "'") | ("\"" ~ (string_value|string_value_escape_1) ~ "\"")}
             |                                                                          ^----------^
             |
             = expression cannot fail; following choices cannot be reached
Guy Korland

1 Answer

string_value can potentially match the empty string (since it's an arbitrary repetition using the Kleene star *). So it can't fail, as the error message says, because no matter where you are in the input, there's always an empty string in front of you.

Thus, (string_value|string_value_escape_2) will never match string_value_escape_2, because that won't be tried until string_value fails.
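One possible way out, sketched here and not tested against the asker's project: between matching quotes, string_value_escape_1 and string_value_escape_2 already accept everything string_value accepts, so the inner choice can be dropped altogether. This assumes the code that consumes the parse tree does not need a separate string_value pair for escape-free strings.

```pest
// Sketch only: each quote style uses just its escape-aware rule.
// Assumption: nothing downstream depends on seeing a `string_value` pair.
string = _{ ("'" ~ string_value_escape_2 ~ "'") | ("\"" ~ string_value_escape_1 ~ "\"") }
    string_escape = @{ "\\" }
    string_value_escape_1 = @{ ((!("\"" | "\\") ~ ANY)+ | string_escape ~ ("\"" | "\\"))* }
    string_value_escape_2 = @{ ((!("'" | "\\") ~ ANY)+ | string_escape ~ ("'" | "\\"))* }
```

Swapping the order of the alternatives instead would not help: the escape-aware rules also end in `*`, so they cannot fail either, and the same error would be expected for whichever rule comes second.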

rici
  • But it will fail right after, on the `~ "'"` part. Why doesn't it backtrack in this case and try to match string_value_escape_2? – Guy Korland Dec 27 '22 at 10:02
  • @guy That's not how PEG works. Once a rule succeeds, that's it. If you have `A ~ B`, and `A` succeeds and then `B` fails, the concatenation fails. If you had `A ~ B / C`, then when `A ~ B` fails, it will fall back to `C`. It never "reopens" a successful match of a subrule. – rici Dec 27 '22 at 13:57
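The `A ~ B / C` shape from the last comment can be applied to the grammar in the question (pest writes ordered choice as `|`, and `~` binds tighter than `|`). The sketch below is untested and is only one possible layout. It moves the closing quote inside each alternative, so the first alternative is a sequence that can fail, and the second one becomes reachable.

```pest
// Sketch only: the closing quote is part of each alternative, so
// `string_value ~ "'"` can fail and the parser then tries the escape rule.
string = _{
      ("'"  ~ (string_value ~ "'"  | string_value_escape_2 ~ "'"))
    | ("\"" ~ (string_value ~ "\"" | string_value_escape_1 ~ "\""))
}
    string_escape = @{ "\\" }
    string_value = @{ (!("\"" | "\\" | "'") ~ ANY)* }
    string_value_escape_1 = @{ ((!("\"" | "\\") ~ ANY)+ | string_escape ~ ("\"" | "\\"))* }
    string_value_escape_2 = @{ ((!("'" | "\\") ~ ANY)+ | string_escape ~ ("'" | "\\"))* }
```

With this layout, an input such as 'a\'b' should first try string_value, fail at the backslash where a closing quote is required, and then fall back to string_value_escape_2, which is the behaviour of the original four-alternative rule.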