Edit (to generalize the problem):

I'd like to parse a grammar, where

<prefix> ::= [a-z]*
<middle> ::= xxx
<suffix> ::= b+
<grammar> ::= <prefix><middle><suffix>

I expect (for example) the following words to pass: aaaaxxxbb, axxxaaxxxbbb, xxxxxxbb
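
As a reference check of the grammar above (not a parser-combinator solution), the following sketch uses only the Scala standard library. It relies on the grammar being regular, so a backtracking regex engine can accept the example words. The names `GrammarCheck` and `split` are made up for illustration.

```scala
// Sketch only: the grammar is regular, so a plain regex can serve as a reference check.
object GrammarCheck {
  // <prefix><middle><suffix> written as one pattern with three capture groups.
  // Assumption: the prefix may itself contain "xxx", as in the second example word.
  private val Grammar = "([a-z]*)(xxx)(b+)".r

  // Returns the three parts when the whole word matches, otherwise None.
  // Pattern matching on a Regex only succeeds if the entire string matches.
  def split(word: String): Option[(String, String, String)] = word match {
    case Grammar(prefix, middle, suffix) => Some((prefix, middle, suffix))
    case _                               => None
  }

  def main(args: Array[String]): Unit =
    Seq("aaaaxxxbb", "axxxaaxxxbbb", "xxxxxxbb", "aaaa").foreach { w =>
      println(s"$w -> ${split(w)}")
    }
}
```

The regex engine first lets `[a-z]*` consume as much as it can and then gives characters back until `xxx` and `b+` fit, which is the backtracking behaviour the parser below is expected to show.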

Original post:

I expected the following parser to backtrack and find a solution in the end:

import fastparse.all._  // fastparse 0.x API

val before = P(AnyChar.rep.!)   // greedy: consumes every remaining character
val content = P("xxx".!)
val after = P("b".rep.!)
val all = P(before ~ content ~ after ~ End)
def test() = {
  val r = all.parse("aaaaxxxbbb")
  println(r)
}

Instead, it looks like the `before` part greedily consumes all the text, and the parser then fails without backtracking.

Am I missing something?

Gabor Juhasz
  • Since you've defined `before` in such a way that it parses any text, you shouldn't be surprised. – jub0bs Oct 11 '16 at 21:53
  • But why doesn't it backtrack? Or, to put it another way: how can I define a parser that parses `aaaaxxxbbb` as well as `xxxxxxxbbb`? (Keeping in mind that I might have multiple "keywords" for content other than `xxx` in the future, and I wouldn't want to list them all if possible.) – Gabor Juhasz Oct 11 '16 at 22:07
  • Why should it backtrack? It doesn't fail! – Jörg W Mittag Oct 11 '16 at 23:34

1 Answer

I was able to solve the issue in the end.

It seems reasonable that the parser won't backtrack within a repetition, so I thought I should rewrite the `AnyChar.rep` part as a recursive rule, like this:

val before: P[Any] = P(AnyChar | (AnyChar ~ before))

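But that wasn't enough; fastparse still doesn't seem to backtrack. Its `|` is an ordered choice: once the first alternative (`AnyChar`) succeeds, the second one is never tried, even if a later part of the sequence fails.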

I stumbled upon this question about parsing ambiguous grammars, so I tried using GLL combinators instead of fastparse, and that made it work.

import com.codecommit.gll._  // gll-combinators

object TestParser1 extends Parsers with RegexParsers {

  lazy val before: Parser[String] = (".".r | (".".r ~ before)) ^^ {
    case a ~ b => a.toString + b.toString
    case x => x.toString
  }
  lazy val content: Parser[String] = "xxx"
  lazy val after: Parser[String] = "b+".r
  lazy val all: Parser[String] = before ~ content ~ after ^^ {
    case (b, c, a) => s"$b $c $a"
  }

  def test() = {
    val r = all("aaaaxxxbbb")
    r.toList foreach println
  }

}
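
For comparison, it may also be possible to stay with fastparse by making the prefix stop explicitly through a negative lookahead rather than relying on backtracking. The following is a minimal sketch under assumptions: it targets the fastparse 0.x API (`fastparse.all._`) used in the question, it follows the generalized grammar (`[a-z]*`, `b+`), and the names `LookaheadSketch`, `tail` and `prefix` are made up for illustration.

```scala
import fastparse.all._  // assumption: fastparse 0.x, as in the question

object LookaheadSketch {
  val middle = P("xxx")
  val suffix = P("b".rep(1))            // b+ : at least one 'b'

  // What must remain once the prefix is finished.
  val tail = P(middle ~ suffix ~ End)

  // Consume one lowercase letter at a time, but only while the rest of the
  // input is NOT already a valid <middle><suffix> up to the end of input.
  val prefix = P((!tail ~ CharIn('a' to 'z')).rep.!)

  // Yields the three captured parts as a tuple on success.
  val all = P(prefix ~ middle.! ~ suffix.! ~ End)

  def main(args: Array[String]): Unit =
    Seq("aaaaxxxbb", "axxxaaxxxbbb", "xxxxxxbb").foreach { w =>
      println(s"$w -> ${all.parse(w)}")
    }
}
```

The trade-off is that the lookahead re-scans the tail at every position, and the prefix rule has to know what follows it, which is less modular than a genuinely backtracking or generalized parser such as the GLL version above.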
Gabor Juhasz