I am using Attoparsec which is said to backtrack by default. However, the following line:

parseOnly  (string "foo" *> many1 anyChar <* string "bar") "fooxxxbar"

fails with:

Left "not enough input"

Why is that so? If many1 anyChar decides to parse only three characters (xxx) it should be successful. And it should consider doing that at some point because of backtracking, shouldn't it?

What is the proper way to do the equivalent of the regex /foo(.*)bar/ using Attoparsec?

Iguana Bob

1 Answer

> I am using Attoparsec which is said to backtrack by default.

Not quite. Attoparsec does support backtracking, but only in some explicit situations (where the documentation says it does). Its purpose is high-performance parsing and, unsurprisingly, that doesn't play well with backtracking.
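
To make the distinction concrete, here is a minimal sketch. It assumes the `Data.Attoparsec.Text` flavour of the API with `OverloadedStrings` (the question does not say which module is in use), and the names `rewinds` and `greedy` are made up for illustration. The first parser shows the backtracking Attoparsec does perform: a failed branch of `<|>` gives back the input it consumed. The second is the parser from the question, where `many1 anyChar` takes the rest of the input and nothing ever asks it to try a shorter match.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Attoparsec.Text (Parser, anyChar, many1, parseOnly, string)
import Data.Text (Text)

-- The left branch consumes "foo" and then fails on "baz". Attoparsec
-- rewinds the input, so the right branch starts again at the first 'f'.
rewinds :: Parser Text
rewinds = (string "foo" *> string "baz") <|> string "foobar"

-- The parser from the question. many1 anyChar consumes "xxxbar" in one go,
-- so string "bar" only ever sees the end of the input. There is no
-- alternative to fall back to, hence no retry with fewer characters.
greedy :: Parser String
greedy = string "foo" *> many1 anyChar <* string "bar"
```

Loaded into GHCi, `parseOnly rewinds "foobar"` gives `Right "foobar"`, while `parseOnly greedy "fooxxxbar"` gives a `Left`, as reported in the question.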

You are looking for manyTill or manyTill'. Note that the backtracking behaviour is mentioned in the documentation.

ghci> manyTill1 p e = (:) <$> p <*> manyTill p e 
ghci> parseOnly (string "foo" *> manyTill1 anyChar (string "bar")) "fooxxxbar"
Right "xxx"
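
For anyone who prefers a file over a GHCi session, the same idea can be sketched as a small module. This assumes `Data.Attoparsec.Text` and `OverloadedStrings`, which the question does not confirm. `fooBar` and `untilEnd` are hypothetical names: `untilEnd` is a hand-rolled illustration of how a "many until terminator" combinator can be built on `<|>`, not a copy of the library's source.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Attoparsec.Text (Parser, anyChar, manyTill, parseOnly, string)

-- Helper from the answer: like manyTill, but requires at least one item.
manyTill1 :: Parser a -> Parser b -> Parser [a]
manyTill1 p e = (:) <$> p <*> manyTill p e

-- "foo", then one or more characters, stopping at the first "bar".
fooBar :: Parser String
fooBar = string "foo" *> manyTill1 anyChar (string "bar")

-- Illustration only: at every position, try the terminator first; if it
-- fails, the input is rewound, one item is taken with p, and we repeat.
untilEnd :: Parser a -> Parser b -> Parser [a]
untilEnd p end = go
  where
    go = ([] <$ end) <|> ((:) <$> p <*> go)
```

Note that this stops at the first `bar`, so it behaves like the non-greedy `/foo(.+?)bar/` rather than a greedy `.*`: `parseOnly fooBar "fooxbarbar"` gives `Right "x"` and leaves the second `bar` unconsumed.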
Alec
  • Well, the documentation explicitly says: "attoparsec parsers always backtrack on failure". What parser combinators are there with full backtracking support? Performance is not an issue for me and the proposed solution with `manyTill` does not work well for my use case. – Iguana Bob Apr 17 '17 at 00:27
  • @IguanaBob [regex-applicative](http://hackage.haskell.org/package/regex-applicative) does not backtrack but will always succeed if it is possible to. [ReadP](http://hackage.haskell.org/package/base-4.9.1.0/docs/Text-ParserCombinators-ReadP.html) has the same property and can parse more grammars. [ReadS](http://hackage.haskell.org/package/base-4.9.1.0/docs/Text-ParserCombinators-ReadP.html#t:ReadS) is basically a less efficient ReadP, and does actually backtrack -- but I consider "backtracking" to be an implementation detail for the real property you care about, which all three satisfy. – Daniel Wagner Apr 17 '17 at 01:08
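
As a rough sketch of the ReadP route mentioned in the comment above, the parser from the question carries over almost unchanged, using only `base`. The names `fooBar` and `matchFooBar` are hypothetical. ReadP's `many1` explores every possible length in parallel, so the parse succeeds whenever some split of the input works. The trailing `eof` is an added assumption that the whole input must match.

```haskell
import Text.ParserCombinators.ReadP (ReadP, eof, get, many1, readP_to_S, string)

-- "foo", then one or more arbitrary characters, then "bar" at the very end.
-- many1 get does not commit to a length, so every split is considered.
fooBar :: ReadP String
fooBar = string "foo" *> many1 get <* string "bar" <* eof

-- Return the captured middle part of the first complete parse, if any.
matchFooBar :: String -> Maybe String
matchFooBar s = case readP_to_S fooBar s of
  ((captured, _) : _) -> Just captured
  []                  -> Nothing
```

In GHCi, `matchFooBar "fooxxxbar"` evaluates to `Just "xxx"` and `matchFooBar "fooxbarbar"` to `Just "xbar"`. The second result is what a greedy `.*` anchored at both ends would capture.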