Why is the generated parser so slow?

Question

I was playing around with PEG.js, trying to create a parser that takes a string and produces an object.

For example, take the string "a&b" and create:

{type:"operator",value:operator, children:[a, b]}

However, I have reached the stage where returning a result can take over 10 seconds if there are two or more levels of nesting.

The test argument I have been using is:

(All a (All a (All a b)))

The grammar does return the right answer, but takes far too long. My question is, what is causing this time delay for such a simple parse?

The grammar can be tried and edited online at PEG.js.

My grammar is:

start = sep* All:All sep* {return All} 

All = sep* operator:"All" sep* xValue: Ex sep* pValue: Ex{return {type:"operator",value:operator, children:[xValue, pValue]}} /Ex

Ex = sep* operator:"Ex" sep* xValue: AND sep* pValue: AND {return {type:"operator",value:operator, children:[xValue, pValue]}} /AND

AND= left: Plus  sep* operator:"&" sep* right:AND {return {type:"operator", value:operator, children:[left,right]}} / Plus 

Plus = left: Equals sep* operator:"+" sep* right:Plus{return {type:"operator", value:operator, children:[left,right]}}/ Equals 

Equals = left:GEQ sep* operator:"=" sep* right:Equals{return {type:"operator", value:operator, children:[left,right]}}/GEQ

GEQ = left:implication  sep* operator:">=" sep* right:GEQ{return {type:"operator", value:operator, children:[left,right]}}/implication 

implication = left:OR sep* operator:"->" sep* right:implication{return {type:"operator", value:operator, children:[left,right]}}/OR 

OR =  left:Not  sep* operator:"|" sep* right:OR{return {type:"operator", value:operator, children:[left,right]}}/Not  

Not = sep* operator:"¬" sep* right:Suc{return {type:"operator", value:operator, children:[right]}}/Suc

Suc = sep* operator:"suc" sep* right:primary{return {type:"operator", value:operator, children:[right]}}/primary 

primary  = letter:letter{return {type:"variable", value:letter}}/ "{" sep* All:All sep* "}" {return All}/"(" sep* All:All sep* ")" {return All} 

sep = spaces:[' ',\\t] 

letter  = "false"/"0"/letters:[A-Za-z]

PEG is a backtracking parser; it will try multiple alternatives looking for one that works. You're probably backtracking to death. — Ira Baxter, Dec 30 '14 at 06:05
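
To make the backtracking point concrete, here is a rough JavaScript sketch. It is not PEG.js output and not the asker's grammar: it only models the six binary levels (AND down to OR) plus parentheses, and it ignores All, Ex, ¬, suc and sep. The names OPS, makeParser and countPrimaryCalls are made up for illustration. Each level follows the pattern "Level = Next op Level / Next", so when the operator is missing, the second alternative parses Next all over again.

```javascript
// Hypothetical mini-model of the precedence chain; not generated by PEG.js.
// Each level mimics a rule of the form:  Level = Next op Level / Next
const OPS = ["&", "+", "=", ">=", "->", "|"]; // AND .. OR, as in the question

function makeParser(leftFactored) {
  let primaryCalls = 0;

  // primary = letter / "(" Level0 ")"
  function primary(input, pos) {
    primaryCalls++;
    if (input[pos] === "(") {
      const inner = level(0, input, pos + 1);
      if (inner !== -1 && input[inner] === ")") return inner + 1;
      return -1;
    }
    return /[A-Za-z]/.test(input[pos] || "") ? pos + 1 : -1;
  }

  // Returns the position after the match, or -1 on failure.
  function level(i, input, pos) {
    if (i === OPS.length) return primary(input, pos);
    const op = OPS[i];
    const left = level(i + 1, input, pos);
    if (left === -1) return -1; // Next failed, so both alternatives fail
    if (input.startsWith(op, left)) {
      const right = level(i, input, left + op.length);
      if (right !== -1) return right;
    }
    // Naive form: the second alternative re-parses Next from scratch.
    // Left-factored form: reuse the result that was already computed.
    return leftFactored ? left : level(i + 1, input, pos);
  }

  return function countPrimaryCalls(input) {
    primaryCalls = 0;
    const end = level(0, input, 0);
    if (end !== input.length) throw new Error("parse failed");
    return primaryCalls;
  };
}

console.log(makeParser(false)("a"));     // 64
console.log(makeParser(false)("(a)"));   // 4160
console.log(makeParser(false)("((a))")); // 266304
console.log(makeParser(true)("((a))"));  // 3
```

In this model every extra level of parentheses multiplies the work by roughly 64, which is at least consistent with the slowdown described in the question. Two possible ways out, both untested against the real grammar: left-factor each rule so the common prefix is parsed once, or turn on the cache option that PEG.js documents for avoiding exponential parse time in pathological cases.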

score 0 · Answer 1 · answered Jan 16 '15 at 22:11

I'd guess it has something to do with all of your sep*s. If you look at the examples in Bryan Ford's original PEG paper, the only rule that starts with whitespace is the first one. He then splits the grammar so that the lexical portion (the token rules) sits at the bottom, with each token definition followed by whitespace. I think doing the same will solve your problem, but even if it doesn't, it will make the grammar more readable (and probably easier to fix).

For example:

    start = SPACING All:All
    All   = operator:All_op xValue:Ex pValue:Ex
          / Ex
    Ex    = operator:Ex_op xValue:AND pValue:AND
    /* etc. */


    /* Lexical Portion */
    All_op = 'All' SPACING
    Ex_op  = 'Ex'  SPACING

    SPACING = [ \t]*
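
The same convention, written as a small JavaScript sketch for anyone who finds it easier to read as code. The helper names skipSpacing, token, AllOp and ExOp are invented for this illustration and are not part of PEG.js. The idea is that each token consumes its own trailing spaces and tabs, so no other rule has to start with whitespace handling.

```javascript
// Hypothetical helpers mirroring the "token followed by SPACING" convention.

// SPACING = [ \t]*  : advance past any run of spaces and tabs.
function skipSpacing(input, pos) {
  while (input[pos] === " " || input[pos] === "\t") pos++;
  return pos;
}

// Build a token matcher: match the literal, then swallow trailing spacing.
// Returns the new position, or -1 if the literal is not at pos.
function token(literal) {
  return (input, pos) =>
    input.startsWith(literal, pos)
      ? skipSpacing(input, pos + literal.length)
      : -1;
}

const AllOp = token("All"); // All_op = 'All' SPACING
const ExOp = token("Ex");   // Ex_op  = 'Ex'  SPACING

// start = SPACING All:All  : leading whitespace is skipped exactly once.
const startPos = skipSpacing("  All a b", 0);
console.log(AllOp("  All a b", startPos)); // 6, the index of "a"
```

With this layout a failed alternative never has to re-scan whitespace before discovering that the keyword does not match.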
