1

Let's assume we have three elements a b and c.

A valid expression uses these three elements (and optional whitespace).

  1. At least one of these three elements has to be present.
  2. All three elements are optional (as long as at least one of the other two elements is present, see 1).
  3. The order in which these three elements are supplied is not important.

Is there an idiomatic way to write a PEG grammar that meets these three requirements?

I played with peg.js at http://pegjs.org/online and solved (1) (lookahead) and (2), but (3) eludes me. Any suggestions?

e = &(a / b / c) (a? b? c?) 

a = 'a' _
b = 'b' _
c = 'c' _

_ = [ \t]*
user1924627
  • 349
  • 4
  • 11
  • Point 3 is met by the production `&(a / b / c) (a? b? c?)`, no? – Bart Kiers May 09 '16 at 14:41
  • No, rule (3) isn't met: due to the greediness of PEG, the supplied grammar detects `a b c`, `a b`, `a c`, `b`, `b c` and `c` but not arrangements such as `c b` or `c b a`. – user1924627 May 09 '16 at 15:50
  • It has nothing to do with greediness. It doesn't match `c b` because `(a? b? c?)` requires `b` to come before `c`. That would be true whether or not PEG were greedy; that is, it is also the case with non-PEG backtracking matchers. – rici May 09 '16 at 18:59
  • Thanks for putting this straight! – user1924627 May 09 '16 at 19:52

2 Answers2

1

Thanks to the awesomeness of peg.js, it's not too hard to provide a check function that returns true (and consumes the input) if a list of elements s is a combination of some set of elements S (no repetitions allowed). The basic idea is to compute the powerset of S and map each element of s to a prime. Each element of S is mapped to the product of primes of its corresponding elements, i.e. each element of the powerset of S is mapped to a unique number. A set s is a combination of the elements in S if and only if the product of the corresponding primes in s is among the products of primes computed from S. (I guess, there is more than one way to perform this check :-) ). Below is a solution for peg.js with 5 elements which I consider pretty efficient. (A little gotcha when using & { predicate }: the javascript inside is called with all named expressions in the arguments object, so (a / b /c /d /e)+ has to have a name such as el:(a / b /c /d /e)+ ).

{
    // array of elements (expressions)
    var data = ['a','b','c', 'd', 'e'];

    // map elements to primes
    var primemap = {
       a: 2,
       b: 3,
       c: 5,
       d: 7,
       e: 11
    };

    // powerset of an array
    function powerset(arr) {
        var ps = [ [] ];
        for (var i=0; i < arr.length; i++) {
            for (var j = 0, len = ps.length; j < len; j++) {
                ps.push(ps[j].concat(arr[i]));
            }
        }
        return ps;
    }

    // compute the product of primes corresponding to each element of an array arr
    function primeprod(arr) {
       return arr.reduce( function(p,c) { return p * primemap[c] }, 1 );  
    }

    // compute powerset and remove empty set at index 0 of the powerset
    var ps = powerset(data);
    ps.splice(0,1);
    // map elements of powerset to products of primes
    var prods = ps.map( function(el) { return primeprod(el); });

    // returns true if an arr is a combination of the elements
    function isCombination(arr) {
       return prods.indexOf(primeprod(arr)) !== -1
    }
}

expr =  exp / blankline;

exp = (el:(a / b / c / d / e)+ &{ return isCombination(Array.prototype.slice.call(arguments)[0]); } {return el; } ) rest*

a = _ a:'a' {return a; }
b = _ b:'b' {return b; }
c = _ c:'c' {return c; }
d = _ d:'d' {return d; }
e = _ e:'e' {return e; }

rest = [^abcde]

blankline =
    [ \t]* ("\n" / eof) { return []; }

_ = [ \t]*
eof = !.
user1924627
  • 349
  • 4
  • 11
  • Thanks to that code I know how to use `&{}` but why don't you use an object or a map and count occurrences, something like this: https://gist.github.com/nedzadarek/b0bf9aaaefd084be4f411a6767eee7fa ? I think it would be faster than at least 2 loops and some function calls like you used. ps. I'm not sure about ifs in my code – Darek Nędza Feb 01 '17 at 09:05
0

Really the only possibility is to list all six possible orders, because PEG does not have an "unordered permutation" operator. (And neither do traditional context free grammars, so roughly the same procedure is necessary.

For example, you could use:

a (b c? / c b?)? / b (a c? / c a?)? / c (a b? / b a?)?

but that is obviously tedious to construct for a larger number of alternatives.

It's usually easier to solve "a list of x, y, ... in any order but without repetitions" and the like by accepting an arbitrary list of x, y, ..., and then checking for repetitions in the semantic action. That not only makes the grammar easier to write, it allows for more meaningful error messages.

rici
  • 234,347
  • 28
  • 237
  • 341
  • Thanks, rici. I already had the feeling that this can't be done in the grammar. I found some workaround in peg.js (see the answer to my own question.) – user1924627 May 10 '16 at 11:35