I had a parser that worked well with Scala's Packrat parser combinators. I would like to try something faster with the Fastparse library. However, Fastparse cannot handle left recursion: the parser goes into an infinite loop. Is there a standard way to cope with that?

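// Assumed imports (not shown in the original question)
import fastparse._, NoWhitespace._

sealed trait Expr

case class Num(value: java.lang.Number) extends Expr

case class Div(a: Expr, b: Expr) extends Expr

def num[_: P] = P(CharIn("0-9").rep(1).!).map(n => Num(n.toInt))

// Left-recursive: div starts by calling expr, which starts by calling div
def div[_: P] = P(expr ~ "/" ~ expr).map(Div.tupled)

def expr[_: P]: P[Expr] = P(div | num)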
Mario Galic
dawid
    The general answer is: rewrite your code so that recursion won't be the first choice to try out among several possibilities. I would start out with replacing `P(div | num)` with `P(num | div)` and if that won't help, rethinking the grammar on paper before attempting to change the code. – Mateusz Kubuszok Jul 22 '20 at 10:29

1 Answer

I don't know much about Fastparse, but I'll try to answer your question nevertheless. Right now, your grammar looks something like this:

expr ::= div | num
div  ::= expr "/" expr
num  ::= 0 | 1 | ...
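To see why a grammar of this shape is a problem for a top-down parser, here is a minimal hand-written sketch in plain Scala (no Fastparse involved). The object `LeftRecursionDemo` and the helpers `lit` and `overflows` are hypothetical names made up for this illustration. Each function returns the position after a successful match, and `div` calls `expr` at the same position it was entered with, so nothing is ever consumed before the next recursive call.

```scala
object LeftRecursionDemo {
  // num ::= one or more digits; returns the position after the digits
  def num(in: String, pos: Int): Option[Int] = {
    val end = in.indexWhere(!_.isDigit, pos) match {
      case -1 => in.length
      case i  => i
    }
    if (end > pos) Some(end) else None
  }

  // Matches a literal string at the given position
  def lit(s: String)(in: String, pos: Int): Option[Int] =
    if (in.startsWith(s, pos)) Some(pos + s.length) else None

  // div ::= expr "/" expr
  // The first step re-enters expr at the same position: left recursion.
  def div(in: String, pos: Int): Option[Int] =
    for {
      p1 <- expr(in, pos)
      p2 <- lit("/")(in, p1)
      p3 <- expr(in, p2)
    } yield p3

  // expr ::= div | num
  def expr(in: String, pos: Int): Option[Int] =
    div(in, pos).orElse(num(in, pos))

  // Reports whether parsing the input exhausts the call stack
  def overflows(input: String): Boolean =
    try { expr(input, 0); false }
    catch { case _: StackOverflowError => true }
}
```

Calling `LeftRecursionDemo.overflows("1/2")` should report a stack overflow, because `expr` and `div` keep calling each other at position 0 and `num` is never reached.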

So if you wanted to parse 1/2 as an expression, the parser would first try to match div. To do that, it would try to match expr again at the same position, which tries div again, and so on forever without consuming any input. We can avoid the loop by putting num before div, as suggested in a comment above:

expr ::= num | div
Or, in Scala:
def expr[_: P]: P[Expr] = P(num | div)

Success! Or is it? Looking more closely at the result, you'll see that it's not a Div(Num(1), Num(2)) but just a Num(1): the parser stopped after the first number. To force it to consume the whole input, add End:

def expr[_: P]: P[Expr] = P((num | div) ~ End)

And now it fails, reporting that it found "/2" where it expected the end of the input. Because the choice is ordered, num matches first, and the parser has no reason to think that the first number is part of a division. So div has to be tried before num after all, to make sure the longer pattern wins, but something must be done to avoid the left recursion. We can refactor the grammar like this:

expr ::= div
div  ::= num ("/" num)*

Here div no longer matches only a division: it also matches a single number, but it consumes a division whenever one is present. In Scala, that would be:

def div[_: P] = P(num ~ ("/" ~/ num).rep).map {
  case (a, ops) => ops.foldLeft(a: Expr){ case (a, b) => Div(a, b) }
}

def expr[_: P]: P[Expr] = P(div ~ End)

This way, we can match "1/2", "1/2/3", "1/2/3/4", etc.

Output for parse("1/2/3/4", expr(_)) is Parsed.Success(Div(Div(Div(Num(1),Num(2)),Num(3)),Num(4)), 7)
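For reference, the pieces above can be assembled into one file. This is a sketch under a few assumptions: Fastparse 2.x on Scala 2 is on the classpath, the `NoWhitespace` import is used (the original post does not show its imports), `Num` holds an `Int` instead of a `java.lang.Number` to keep the example small, and the wrapper object `DivParser` is a hypothetical name.

```scala
import fastparse._, NoWhitespace._

object DivParser {
  sealed trait Expr
  case class Num(value: Int) extends Expr
  case class Div(a: Expr, b: Expr) extends Expr

  // num ::= one or more digits
  def num[_: P] = P(CharIn("0-9").rep(1).!).map(n => Num(n.toInt))

  // div ::= num ("/" num)*
  // The repetition replaces the left recursion; the left fold rebuilds
  // the left-nested tree that the recursive rule would have produced.
  def div[_: P] = P(num ~ ("/" ~/ num).rep).map {
    case (first, rest) =>
      rest.foldLeft(first: Expr) { case (acc, rhs) => Div(acc, rhs) }
  }

  // expr ::= div, followed by the end of the input
  def expr[_: P]: P[Expr] = P(div ~ End)

  def main(args: Array[String]): Unit =
    println(parse("1/2/3/4", expr(_)))
}
```

The fold starts from the first number and wraps the accumulated tree in a new `Div` for every further operand, which is why the operator comes out left-associative. The cut in `"/" ~/ num` means that once a `/` has been consumed, a missing operand is reported as a failure instead of being backtracked over.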

user
  • I thought about a similar solution. However, I am concerned that I will have to resort to this approach all over the grammar. Isn't there a transparent solution, maybe putting some state inside `expr`? – dawid Jul 22 '20 at 16:44
  • @viyps Unfortunately, probably not. From what I can tell of Fastparse, this is the best solution. It's not that bad, though. How often do you really need left-recursion in your language? – user Jul 22 '20 at 17:42
  • Left recursion everywhere in the language, unfortunately. – dawid Aug 05 '20 at 05:17