I'm trying to understand how to use PEG.js for simple search/replace in a text. Surely this is not the intended use for a parser, but I'm curious about the logic behind this kind of language when it is used for search/replace.

The problem I'm having is that it is hard to define positively the complement of some definitions. For example, imagine I want to search and replace something like this syntax:

rule = (whatever_is_not_my_syntax* m:my_syntax)+ {replace m}
word = [a-z0-9_]+
my_syntax = word "." word
whatever_is_not_my_syntax = ???

It is hard to describe positively what whatever_is_not_my_syntax is in PEG.js without a partial collision (and consequent parser error) with my_syntax, or at least I don't know how to do it, because the only negative constructs in PEG.js are !expression and [^characters].

Can you help me? I would appreciate any book or bibliography on this topic, if one exists. Thank you in advance.

Masacroso

2 Answers

You don't have to specify what's not in your syntax. First try to match your syntax, then have a fallback for anything else.

Here rule is a list of all patterns in your syntax. When it doesn't match, other will match instead.

expr =
    a:rule " " b:expr
        {return [a].concat(b)}
  / a:rule
        {return [a]}
  / a:other " " b:expr
        {return [a].concat(b)}
  / a:other
        {return [a]}

word =
    a:[a-z0-9_]+
        {return a.join("")}

rule =
    word "." word
        {return "rule1"} // Put replace logic here
  / word ":" word
        {return "rule2"} // Put replace logic here

other =
    word
        {return "other"}

You can try it online: http://pegjs.org/online
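
To see the "match your syntax first, then fall back" logic outside of PEG.js, here is a rough plain-JavaScript sketch of the same idea. It is not how PEG.js works internally: the names WORD, patterns and classify are made up for illustration, and it assumes, like the grammar above, that tokens are separated by single spaces. The patterns are tried in order, so the catch-all "other" only applies when no rule matched.

```javascript
// Hand-written stand-in for the grammar above; no PEG.js required.
// WORD, patterns and classify are hypothetical names for this sketch.
const WORD = "[a-z0-9_]+";

// Tried top to bottom, mirroring PEG's ordered choice (rule / other).
const patterns = [
  { re: new RegExp(`^${WORD}\\.${WORD}$`), result: "rule1" },
  { re: new RegExp(`^${WORD}:${WORD}$`), result: "rule2" },
  { re: new RegExp(`^${WORD}$`), result: "other" }, // fallback
];

function classify(input) {
  return input.split(" ").map((token) => {
    // First pattern that matches the whole token wins.
    const hit = patterns.find((p) => p.re.test(token));
    if (!hit) throw new SyntaxError(`Unexpected token: "${token}"`);
    return hit.result;
  });
}
```

With this sketch, classify("a x.y b x:y") returns ["other", "rule1", "other", "rule2"].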

fafl

You were close; the only big difference is the order:

Instead of whatever_is_not_my_syntax* m:my_syntax, it should be m:my_syntax whatever_is_not_my_syntax. Note that applying * to whatever_is_not_my_syntax inside a group that is itself repeated generates this error:

Line 1, column 15: Possible infinite loop when parsing (repetition used with an expression that may not consume any input).

So plain whatever_is_not_my_syntax, without the *, is enough, because the outer repetition in root already handles any number of unmatched characters:

root = result:(rules / whatever_is_not_my_syntax)* { return result.join(""); }

rules = 
  my_syntax /
  else_syntax

else_syntax = word ":" word { return "replace2" }

my_syntax = word "." word { return "replace1" }

word = [a-z0-9_]+

whatever_is_not_my_syntax = .

Input of:

a x:y b x.y c

Will be transformed as:

"a replace2 b replace1 c"

You can try it online: http://pegjs.org/online
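
If it helps to see the logic behind root without PEG.js, the loop can be sketched in plain JavaScript: at each position try the rules in order, and if none matches, copy one character and move on (the role of the . fallback). This is only an approximation under my own assumptions: rules and searchReplace are hypothetical names, and sticky regular expressions stand in for the grammar rules.

```javascript
// Sketch of the scanning loop behind root = (rules / .)*, without PEG.js.
// rules and searchReplace are hypothetical names for this illustration.
const rules = [
  // The "y" (sticky) flag makes exec() match only at re.lastIndex.
  { re: /[a-z0-9_]+\.[a-z0-9_]+/y, replacement: "replace1" }, // my_syntax
  { re: /[a-z0-9_]+:[a-z0-9_]+/y, replacement: "replace2" }, // else_syntax
];

function searchReplace(input) {
  let out = "";
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    // Ordered choice: the first rule that matches at pos wins.
    for (const { re, replacement } of rules) {
      re.lastIndex = pos;
      const m = re.exec(input);
      if (m) {
        out += replacement;
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    // Fallback: consume exactly one character unchanged.
    if (!matched) {
      out += input[pos];
      pos += 1;
    }
  }
  return out;
}
```

With this sketch, searchReplace("a x:y b x.y c") returns "a replace2 b replace1 c", the same as the grammar. Because the fallback always consumes exactly one character, the loop is guaranteed to advance, which is the property the "possible infinite loop" error is protecting.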

Konard