I'm new to PEGjs and I'm trying to write a PEGjs grammar that does the same job as the regex (\s*[\(])|(\s*[\)])|(\"[^\(\)]+?\")|([^\(\)\s]+).

Basically what I'm trying to do is transform the test input

(App= smtp AND "SPort" != 25) OR (App= pop3 AND "SPort" != 110) OR (App = imap AND "SPort" != 143) AND (App= imap OR "SPort" != 143)

to the JSON format below

{
  "eventTypes": [
    "All"
  ],
  "condition": {
    "operator": "and",
    "terms": [
      {
        "operator": "or",
        "terms": [
          {
            "operator": "or",
            "terms": [
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "smtp"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "25"
                  }
                ]
              },
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "pop3"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "110"
                  }
                ]
              }
            ]
          },
          {
            "operator": "and",
            "terms": [
              {
                "name": "App",
                "operator": "equals",
                "value": "imap"
              },
              {
                "name": "Sport",
                "operator": "notEquals",
                "value": "143"
              }
            ]
          }
        ]
      },
      {
        "operator": "or",
        "terms": [
          {
            "name": "App",
            "operator": "equals",
            "value": "imap"
          },
          {
            "name": "Sport",
            "operator": "notEquals",
            "value": "143"
          }
        ]
      }
    ]
  }
}

I have written some fairly complex JavaScript code to transform the sample input into the JSON format shown above, but it is complicated and will be hard to maintain in the long term, so I thought I would try a grammar parser instead. Since I'm new to grammars, I'm looking for help or guidance on implementing a grammar that does the above, so that I can extend it as needed.

You can see the output of the Regex here
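For readers without the linked output, here is a minimal sketch of what the regex yields when applied with the global flag. The `tokenize` helper and the shortened sample input are assumptions made for illustration, not part of the original code.

```javascript
// The regex from the question, with the global flag so every match is returned.
const tokenPattern = /(\s*[\(])|(\s*[\)])|(\"[^\(\)]+?\")|([^\(\)\s]+)/g;

// Hypothetical helper: returns the matched tokens in source order.
function tokenize(input) {
  // Parenthesis matches can carry leading whitespace (from \s*), so trim each one.
  return (input.match(tokenPattern) || []).map((token) => token.trim());
}

const tokens = tokenize('(App= smtp AND "SPort" != 25) OR (App= pop3 AND "SPort" != 110)');
console.log(tokens);
```

Note that the regex only splits the input into tokens. It does not capture nesting or operator grouping, which is the part a grammar would have to add.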

EDIT

Javascript solution:

var str = '((Application = smtp AND "Server Port" != 25) AND (Application = smtp AND "Server Port" != 25)) OR (Application = pop3 AND "Server Port" != 110) OR (Application = imap AND "Server Port" != 143) AND (Application = imap OR "Server Port" != 143)';

var final = str.replace(/\((?!\()/g,"['")        //replace ( with [' if it's not followed by (
           .replace(/\(/g,"[")               //replace ( with [
           .replace(/\)/g,"']")              //replace ) with '] 
           .replace(/\sAND\s/g,"','AND','")  //replace AND with ','AND','
           .replace(/\sOR\s/g,"','OR','")    //replace OR with ','OR','
           .replace(/'\[/g,"[")              //replace '[ with [
           .replace(/\]'/g,"]")              //replace ]' with ]
           .replace(/"/g,"\\\"")             //escape double quotes
           .replace(/'/g,"\"");              //replace ' with "
console.log(JSON.parse("["+final+"]"))
  • I created a module, based on peg.js, which does something that's at least very similar to what you want: https://github.com/voxpelli/node-fulfills – VoxPelli Sep 02 '19 at 15:22

1 Answer

To the best of my knowledge, you cannot get exactly the result you want, because it would require a left-recursive rule, which sends the parser into an infinite loop. Specifically, given the following input:

A OR B OR C

You are asking for this output:

(A OR B) OR C

To get this result, you'd need to have a rule like this:

BOOL = left:( BOOL / Expression ) "OR" right:( Expression )

This creates an infinite loop: the first thing BOOL tries to match is BOOL itself, so the parser recurses without ever consuming any input and BOOL can never be resolved. However, we can get

A OR ( B OR C )

because

BOOL = left:( Expression ) "OR" right:( BOOL / Expression )

does not create an infinite loop. This is because we can begin to match something before recursing back into BOOL. It's a little heady, I know, but trust me... you've got to have something for PegJS to start matching before you can recurse.
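To make the difference concrete, here is a minimal hand-written sketch in plain JavaScript of the right-recursive rule. It is not how PEG.js is implemented; the function name `parseRightRecursive` and the whitespace-split token list are assumptions for illustration only.

```javascript
// Hypothetical hand-written parser for inputs like "A OR B OR C".
// It mirrors the right-recursive rule:
//   BOOL = left:( Expression ) "OR" right:( BOOL / Expression )
function parseRightRecursive(tokens, pos = 0) {
  // Consume one operand first, so every recursive call starts further along.
  const left = tokens[pos];
  if (left === undefined || left === "OR") {
    throw new Error(`Expected an operand at token ${pos}`);
  }
  // No "OR" follows: this operand is the whole expression.
  if (tokens[pos + 1] !== "OR") {
    return { node: left, next: pos + 1 };
  }
  // Two tokens have been consumed, so this recursion always terminates.
  const right = parseRightRecursive(tokens, pos + 2);
  return {
    node: { operator: "or", terms: [left, right.node] },
    next: right.next,
  };
}

const rightNested = parseRightRecursive("A OR B OR C".split(" ")).node;
console.log(JSON.stringify(rightNested));
// → {"operator":"or","terms":["A",{"operator":"or","terms":["B","C"]}]}
```

A left-recursive version would have to call itself at the same `pos` before reading any token, so it would never make progress. That is the infinite loop described above.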

If this is acceptable, then I believe this grammar would get you pretty close to the desired output:

// Our top-level rule is Expression
Expression
  = BOOL
  / SubExpression
  / Comparison
  / Term

// A sub expression is just an expression wrapped in parentheses
// Note that this does not cause an infinite loop because the first term is always "("
SubExpression
  = _ "(" _ innards: Expression _ ")" _ { return innards; }

Comparison
  = name:Term _ operator:("=" / "!=") _ value:Term {
      return {
        name: name,
        operator: operator === '=' ? 'equals' : 'notEquals',
        value: value,
      };
    }

BOOL = AND / OR

// AND and OR are separate rules so that each gets its own precedence level.
// As written, OR binds more tightly than AND: an OR can be an operand of AND, but not the reverse.
AND
  = _ left:( OR / SubExpression / Comparison ) _ "AND" _ right:( AND / OR / SubExpression / Comparison ) _ {
    return {
      operator: 'and',
      terms: [ left, right ]
    }
  }

OR
  = _ left:( SubExpression / Comparison ) _ "OR" _ right:( OR / SubExpression / Comparison ) _ {
    return {
      operator: 'or',
      terms: [ left, right ]
    }
  }

Term
  = '"'? value:$( [0-9a-zA-Z]+ ) '"'? {
      return value;
    }

Integer "integer"
  = _ [0-9]+ { return parseInt(text(), 10); }

_ "whitespace"
  = [ \t\n\r]*

Given your input, we'd get:

{
   "operator": "and",
   "terms": [
      {
         "operator": "or",
         "terms": [
            {
               "operator": "and",
               "terms": [
                  {
                     "name": "App",
                     "operator": "equals",
                     "value": "smtp"
                  },
                  {
                     "name": "SPort",
                     "operator": "notEquals",
                     "value": "25"
                  }
               ]
            },
            {
               "operator": "or",
               "terms": [
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "pop3"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "110"
                        }
                     ]
                  },
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "imap"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "143"
                        }
                     ]
                  }
               ]
            }
         ]
      },
      {
         "operator": "or",
         "terms": [
            {
               "name": "App",
               "operator": "equals",
               "value": "imap"
            },
            {
               "name": "SPort",
               "operator": "notEquals",
               "value": "143"
            }
         ]
      }
   ]
}
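
If the left-nested shape from the question, (A OR B) OR C, is really needed, one commonly used workaround is to match the operands as a flat list and fold them into a tree in the rule's action, which avoids left recursion. The grammar above does not do this. The helper name `foldLeft` and the rule shown in the comment are assumptions, sketched here in plain JavaScript and not tested against the grammar.

```javascript
// Hypothetical helper; in PEG.js it could live in the initializer block at the top of the grammar.
// head: the first operand; tail: the remaining operands in source order.
function foldLeft(operator, head, tail) {
  // Each step wraps the tree built so far as the left term of a new node.
  return tail.reduce(
    (left, right) => ({ operator, terms: [left, right] }),
    head
  );
}

// In a grammar this might be used roughly as follows (untested sketch, Operand is a placeholder rule):
//   OR = head:Operand tail:( _ "OR" _ operand:Operand { return operand; } )*
//        { return foldLeft('or', head, tail); }

const leftNested = foldLeft("or", "A", ["B", "C"]);
console.log(JSON.stringify(leftNested));
// → {"operator":"or","terms":[{"operator":"or","terms":["A","B"]},"C"]}
```

With an empty `tail` the helper returns `head` unchanged, so a single operand does not get wrapped in an "or" node.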