A regular expression like /\s(foo|bar|baz)\s.*/ would match the marked part of the following string:

football bartender bazooka baz to the end
                          ^^^^^^^^^^^^^^^

Is it possible to write Parsing Expression Grammar (PEG) rules that would parse the string in a similar fashion, splitting it into a Head and a Tail?

Result <- Head Tail

football bartender bazooka baz to the end
         Head             |    Tail
dimus
  • Have you tried it? What challenges did you find? – Bergi Mar 14 '22 at 23:10
  • As I understand it, the Head rule should accept any word except `foo`, `bar`, and `baz`. I did not succeed in figuring out how to create such a rule. For characters I could use something like `[^,;]`, but a negative lookahead like `!(Space (foo / bar / baz) Space)` or `!(Tail)` did not work for me. – dimus Mar 14 '22 at 23:32
  • You could use `(.*)(\s(foo|bar|baz)\s.*)` and then use the contents of the two capturing groups as head and tail? If the regex engine supports it, you can use named capture groups with `(?P.*)(?P\s(foo|bar|baz)\s.*)` –  Mar 15 '22 at 10:02
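
For comparison, the capture-group idea from the last comment could look roughly like this in JavaScript. This is only a sketch: the group names `head` and `tail` and the helper `splitWithRegex` are assumptions made up for illustration, and JavaScript spells named groups as `(?<name>...)` rather than `(?P<name>...)`.

```javascript
// Assumed group names "head" and "tail"; they are not taken from the comment.
const pattern = /^(?<head>.*)(?<tail>\s(?:foo|bar|baz)\s.*)$/;

// Hypothetical helper: returns { head, tail } or null when there is no delimiter.
function splitWithRegex(input) {
  const m = pattern.exec(input);
  return m ? { head: m.groups.head, tail: m.groups.tail } : null;
}

console.log(splitWithRegex("football bartender bazooka baz to the end"));
```

Because the `head` group is greedy, the split happens at the last delimiter when the string contains more than one.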

1 Answer

Yes, it's achievable using PEG. Here's an example using PEG.js that treats only `baz` as the delimiter:

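// The whole input: head words, a space, then the tail.
start = f:words space r:tail
{
   return [f, r];
}

// The tail is the delimiter "baz", a space, and the remaining words.
// Only the words are returned; the delimiter itself is dropped.
tail = "baz" space r:words
{
   return r;
}

// One or more space-separated words. `space` returns undefined, so the
// (space word) pairs are flattened and the undefined entries filtered out.
words = f:word r:(space word)*
{
   return [f].concat(r).flat().filter(n => n);
}

// The negative lookahead !tail rejects a word wherever a tail begins,
// so "bazooka" is accepted but the standalone "baz " is not.
word = !tail w:$([A-Za-z]+)
{
   return w;
}

space = " "
{
   return;
}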

Output for the input `football bartender bazooka baz to the end`:

[
   [
      "football",
      "bartender",
      "bazooka"
   ],
   [
      "to",
      "the",
      "end"
   ]
]
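
To see what the `!tail` lookahead is doing without a parser generator, here is a hand-written sketch of the same idea in plain JavaScript. It is not PEG.js output. All function names are made up for illustration, and it assumes the delimiter set `foo`, `bar`, `baz` from the question's regex instead of only `baz`. The lookahead is also simplified to "keyword followed by a space".

```javascript
// Assumed delimiter set, taken from the regex in the question.
const KEYWORDS = ["foo", "bar", "baz"];

// Lookahead: returns the keyword if "<keyword> " starts at pos, else null.
// Nothing is consumed, just like a PEG predicate.
function delimiterAt(input, pos) {
  return KEYWORDS.find(k => input.startsWith(k + " ", pos)) || null;
}

// Matches [A-Za-z]+ exactly at pos (sticky regex), or returns null.
function readWord(input, pos) {
  const re = /[A-Za-z]+/y;
  re.lastIndex = pos;
  const m = re.exec(input);
  return m ? m[0] : null;
}

// words <- word (" " word)*   with   word <- !(keyword " ") [A-Za-z]+
// Returns { words, end } where end is the index just past the last word.
function readWords(input, pos) {
  const words = [];
  let end = pos;
  let cur = pos;
  for (;;) {
    if (delimiterAt(input, cur) !== null) break; // negative lookahead
    const w = readWord(input, cur);
    if (w === null) break;
    words.push(w);
    end = cur + w.length;
    if (input[end] !== " ") break;
    cur = end + 1; // tentatively step over the space
  }
  return words.length > 0 ? { words, end } : null;
}

// start <- words " " keyword " " words, and the whole input must be consumed.
function splitHeadTail(input) {
  const head = readWords(input, 0);
  if (head === null || input[head.end] !== " ") return null;
  const keyword = delimiterAt(input, head.end + 1);
  if (keyword === null) return null;
  const tail = readWords(input, head.end + 1 + keyword.length + 1);
  if (tail === null || tail.end !== input.length) return null;
  return [head.words, tail.words];
}

console.log(splitHeadTail("football bartender bazooka baz to the end"));
// → [["football", "bartender", "bazooka"], ["to", "the", "end"]]
```

The key point is the same as in the grammar: the check for a delimiter happens before each word is read, so `bazooka` passes (no space after `baz`) while the standalone `baz ` stops the head.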
Josh Voigts