I am in the process of investigating PEG (Parsing Expression Grammar) parsers, and one of the topics I'm looking into is equivalence with other parsing techniques.

I found a good paper about transforming regexes into equivalent PEGs: "From Regular Expressions to Parsing Expression Grammars".

I am hoping to find a similar treatment for LL(*) parsers but have so far come up empty-handed. It seems to me that many of the techniques described in that paper would also apply to the problem of LL(*) transformation, but I'm not sufficiently steeped in the formalisms to be confident in my own analysis.

Your collective help would be much appreciated!

danfuzz

1 Answer

The Wikipedia article about PEG says it all, I think. PEG does recursive descent, using the order of the alternatives for disambiguation. In theory, the family of languages that can be parsed with recursive descent is the LL family, but because PEG has unlimited lookahead and no ambiguity, the family should be a larger one. Whether it covers every context-free language is, as far as I know, still an open question.

Every LL(k) grammar can be implemented by a recursive-descent parser with k tokens of lookahead. Therefore every LL(k) grammar can be transformed into a PEG by ordering the rules so that those requiring the longest lookahead are listed first.

This is an LL(k) grammar:

params = expr
params = expr ',' params

To make it a PEG grammar for the same language, the rules must be reordered:

params = expr ',' params
params = expr
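
To see why the order matters, here is a minimal sketch of PEG-style ordered choice in Python. It is only an illustration, not a real PEG library: the helpers `lit`, `seq`, `choice` and `matches` are hypothetical names, and `expr` is assumed to be a single lowercase letter so the example stays small.

```python
# Hypothetical mini-PEG: a parser is a function (text, pos) -> new pos, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in turn; fail if any of them fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def expr(text, pos):
    # Assumption for this sketch: an expr is one lowercase letter.
    return pos + 1 if pos < len(text) and text[pos].islower() else None

def params_short_first(text, pos):
    # params = expr / expr ',' params   (original LL ordering)
    return choice(expr, seq(expr, lit(","), params_short_first))(text, pos)

def params_long_first(text, pos):
    # params = expr ',' params / expr   (reordered for PEG)
    return choice(seq(expr, lit(","), params_long_first), expr)(text, pos)

def matches(parser, text):
    """True only if the parser consumes the whole input."""
    return parser(text, 0) == len(text)

print(matches(params_short_first, "a,b,c"))
print(matches(params_long_first, "a,b,c"))
```

With the short alternative first, the choice succeeds after consuming only the first `expr` and never tries the longer alternative, so the rest of the input is left unparsed. With the long alternative first, the parser falls back to the plain `expr` only when no comma follows.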
Apalala
  • Thanks. I know it's possible, but what I'm looking for is someone who has worked out a sufficiently mechanical procedure for the "translation", along with a good proof of, or at least a rationale for, its correctness, similar to the regex paper I cited. (Nonetheless I'll "accept" this answer in a day or two as thanks.) – danfuzz Nov 18 '12 at 21:13
  • @danfuzz Sorry for the silence, I hadn't checked in. I don't quite understand what you're looking for. It should be enough that any LL(k) grammar is also a PEG grammar (PEG > LL). A CFG that cannot be transformed to LL or LR cannot be handled deterministically in conventional ways. PEG should be smaller than CFG (as LL and LR are) because the rule ordering won't allow all the derivations that a CFG allows. But most practical applications are LL, so PEG should be fine. Sorry for the rambling. I can help more if I know what you're after. – Apalala Nov 26 '12 at 01:58