In "Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking" on page 30 the author states that the context-free grammar (CFG):

S -> a S a | a S b | b S a | b S b | a

appears not to have a corresponding parsing expression grammar (PEG).

The above CFG is equivalent to:

S -> (a | b) S (a | b) | a

and can be summarized as "an odd-length string of a's and b's with an 'a' in the middle". However, the straightforward translation of this to a PEG:

S <- (a / b) S (a / b) / a

seems to work fine and to describe the same language.

You can try this out yourself online using peg.js (enter the grammar as S = ('a' / 'b') S ('a' / 'b') / 'a').

Is the author wrong or am I misunderstanding something?

hkBst

1 Answer

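You just didn't test enough. Try inputs consisting of an odd number of a's. All of them match the CFG, but the PEG only accepts those of length 2^k − 1 for some positive integer k (1, 3, 7, 15, ...). The reason is that PEG choice is ordered: once the inner S has matched some prefix, the parser never goes back to try a shorter or longer match for it, even if that would let the overall parse succeed.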
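If you would rather check this mechanically than in peg.js, here is a minimal sketch in Python of a hand-written recognizer for the PEG, next to a direct membership test for the CFG's language. The function names (peg_accepts, cfg_accepts) are made up for this illustration, and the recognizer is hard-coded for this one grammar rather than being a general PEG engine.

```python
from functools import lru_cache


def cfg_accepts(text: str) -> bool:
    """Membership test for the CFG: odd length, only a/b, 'a' in the middle."""
    return (
        len(text) % 2 == 1
        and set(text) <= {"a", "b"}
        and text[len(text) // 2] == "a"
    )


def peg_accepts(text: str) -> bool:
    """Recognizer for the PEG  S <- (a / b) S (a / b) / a  with ordered choice."""
    n = len(text)

    @lru_cache(maxsize=None)  # memoize per position, as a packrat parser would
    def parse_s(pos: int):
        """Return the position after matching S at pos, or None on failure."""
        # First alternative: (a / b) S (a / b)
        if pos < n and text[pos] in "ab":
            mid = parse_s(pos + 1)
            # The inner S yields exactly one result; it is never retried.
            if mid is not None and mid < n and text[mid] in "ab":
                return mid + 1
        # Second alternative, tried only if the first one failed: a
        if pos < n and text[pos] == "a":
            return pos + 1
        return None

    # The whole input must be consumed for the string to be accepted.
    return parse_s(0) == n


if __name__ == "__main__":
    for length in range(1, 16, 2):
        s = "a" * length
        print(length, cfg_accepts(s), peg_accepts(s))
```

Running the loop over strings of a's shows the CFG test accepting every odd length, while the PEG recognizer accepts only lengths 1, 3, 7 and 15 in that range.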

rici
  • Thanks! It appears you are correct and length 5 is indeed the first odd number for which it fails, and then also for 9, 11, 13, but 7 and 15 work. Could you explain why the PEG only matches those specific lengths? – hkBst Jan 12 '22 at 08:20
  • @hkBst: Not easily :-). The best way to try to understand it, in my opinion, is to grab a big block of paper and do the match by hand. It won't take *that* long. I did recently try to explain a very similar issue from exercise 4.4.5 in the Dragon book about recursive descent with backtracking. See https://cs.stackexchange.com/questions/143480/dragon-book-4-4-5-exercise/143975, so that might be of some help. (If you have an earlier edition of the Dragon book, it's exercise 4.13.) – rici Jan 13 '22 at 02:23
  • Also: https://stackoverflow.com/questions/17456994/how-does-backtracking-affect-the-language-recognized-by-a-parser – rici Jan 13 '22 at 02:29