Questions tagged [attoparsec]

A fast Haskell library for parsing ByteStrings

https://github.com/bos/attoparsec

131 questions
6
votes
1 answer

Is it possible to efficiently look ahead more than one Char in Attoparsec?

I'm trying to augment Haskell's Attoparsec parser library with a function takeRegex :: Regex -> Parser ByteString using one of the regexp implementations. (Motivation: Good regex libraries can provide performance that is linear in the length of the…
Sami Liedes
  • 1,084
  • 8
  • 19
5
votes
1 answer

attoparsec-iteratee doesn't work when input is larger than buffer size

I have a simple attoparsec-based PDF parser. It works fine until it is used with iteratee and the size of the input exceeds the buffer size. import qualified Data.ByteString as BS import qualified Data.Iteratee as I import qualified Data.Attoparsec as P import…
Yuras
  • 13,856
  • 1
  • 45
  • 58
5
votes
1 answer

Making attoparsec parsers recursive

I've been coding up an attoparsec parser and have been hitting a pattern where I want to turn parsers into recursive parsers (recursively combining them with the monad bind >>= operator). So I created a function to turn a parser into a recursive…
Rehno Lindeque
  • 4,236
  • 2
  • 23
  • 31
5
votes
1 answer

understanding attoparsec

attoparsec was suggested to me for parsing a file, and now I need to understand how to use it. Somebody gave me this piece of code: # type Environment = M.Map String String import Data.Attoparsec (maybeResult) import qualified Data.Attoparsec.Char8 as…
arpho
  • 1,576
  • 10
  • 37
  • 57
5
votes
2 answers

Attoparsec: skipping up to (but not including) a multi-char delimiter

I have a string that can contain pretty much any character. Inside the string there is the delimiter {{{. For example: afskjdfakjsdfkjas{{{fasdf. Using attoparsec, what is the idiomatic way of writing a Parser () that skips all characters before…
danidiaz
  • 26,936
  • 4
  • 45
  • 95
5
votes
1 answer

Using sepBy string in Attoparsec

I'm trying to separate a string by either ",", ", and" or "and", and then return whatever was in between. An example of what I have so far is as follows: import Data.Attoparsec.Text sepTestParser = nameSep ((takeWhile1 $ inClass "-'a-zA-Z") <*…
oneway
  • 753
  • 1
  • 5
  • 9
5
votes
2 answers

Optimizing a simple parser which is called many times

I wrote a parser for a custom file using attoparsec. The profiling report indicated that around 67% of the memory allocation is done in a function named tab, which also consumes the most time. The tab function is pretty simple: tab :: Parser…
Sibi
  • 47,472
  • 16
  • 95
  • 163
5
votes
2 answers

Converting normal attoparsec parser code to conduit/pipe based

I have written the following parsing code using attoparsec: data Test = Test { a :: Int, b :: Int } deriving (Show) testParser :: Parser Test testParser = do a <- decimal tab b <- decimal return $ Test a b tParser :: Parser…
Sibi
  • 47,472
  • 16
  • 95
  • 163
5
votes
1 answer

Incremental Parsing from Handle in Haskell

I'm trying to interface Haskell with a command line program that has a read-eval-print loop. I'd like to put some text into an input handle, and then read from an output handle until I find a prompt (and then repeat). The reading should block…
davidsd
  • 771
  • 4
  • 18
5
votes
2 answers

"Sub-parsers" in pipes-attoparsec

I'm trying to parse binary data using pipes-attoparsec in Haskell. The reason pipes (proxies) are involved is to interleave reading with parsing to avoid high memory use for large files. Many binary formats are based on blocks (or chunks), and their…
absence
  • 821
  • 1
  • 8
  • 13
5
votes
2 answers

How do I make Attoparsec parser succeed without consuming (like parsec lookAhead)

I wrote a quick attoparsec parser to walk an aspx file and drop all the style attributes, and it's working fine except for one piece of it where I can't figure out how to make it succeed on matching > without consuming it. Here's what I…
Jimmy Hoffa
  • 5,909
  • 30
  • 53
4
votes
1 answer

Attoparsec hangs

I'm currently solving the 4th AOC task, which has the following input format: a line of numbers separated by commas, followed by 5x5 matrices: 27,14,70,7,85,66,65 31 23 52 26 8 27 89 37 80 46 97 19 63 34 79 13 59 45 12 73 42 25 22 6 39 27 71 24 3 …
xbalaj
  • 977
  • 1
  • 8
  • 14
4
votes
2 answers

Parsing and the use of GADTs

I've run into a problem while writing a parser. Specifically, I want to be able to return values of different types. For example, I have two different data types FA and PA to represent two different lipid classes - data FA = ClassLevelFA IntegerMass …
Michael T
  • 1,033
  • 9
  • 13
4
votes
2 answers

Why does this parser always fail when the end-of-line sequence is CRLF?

This simple parser is expected to parse messages of the form key: value\r\nkey: value\r\n\r\nkey: value\r\nkey: value\r\n\r\n One EOL acts as a field separator, and double EOL acts as a message separator. It works perfectly fine when the EOL…
concept3d
  • 2,248
  • 12
  • 21
4
votes
2 answers

Understanding the attoparsec implementation (part 2)

I am currently trying to study and understand the source code of the attoparsec library, but there are some details I can't figure out myself. For example, the definition of the Parser type: newtype Parser i a = Parser { runParser :: forall…
bmk
  • 1,548
  • 9
  • 17