Questions tagged [attoparsec]

A fast Haskell library for parsing ByteStrings

https://github.com/bos/attoparsec

131 questions
4
votes
2 answers

Skipping first line in pipes-attoparsec

My types: data Test = Test { a :: Int, b :: Int } deriving (Show) My parser: testParser :: Parser Test testParser = do a <- decimal tab b <- decimal return $ Test a b tab = char '\t' Now in order to skip the first line, I do something…
Sibi
  • 47,472
  • 16
  • 95
  • 163
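One way to skip a header line, sketched here with plain attoparsec rather than pipes-attoparsec, is to consume everything up to the first newline before running the record parser. The names `skipLine` and `fileParser` are hypothetical, and the `Eq` instance is added only so results can be compared. This is an illustration under those assumptions, not the approach taken in the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (many)
import Data.Attoparsec.ByteString.Char8
  (Parser, char, decimal, endOfLine, parseOnly, skipWhile)

data Test = Test { a :: Int, b :: Int } deriving (Eq, Show)

tab :: Parser Char
tab = char '\t'

-- Two tab-separated integers, as in the question.
testParser :: Parser Test
testParser = Test <$> decimal <* tab <*> decimal

-- Hypothetical helper: discard everything up to and including the line ending.
skipLine :: Parser ()
skipLine = skipWhile (/= '\n') *> endOfLine

-- Hypothetical top-level parser: drop the header, then read one record per line.
fileParser :: Parser [Test]
fileParser = skipLine *> many (testParser <* endOfLine)
```

In GHCi, `parseOnly fileParser "a\tb\n1\t2\n3\t4\n"` should yield the two records after the header.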
4
votes
2 answers

Skipping whitespace excluding newlines in attoparsec

Attoparsec provides the function skipSpace. This function consumes all whitespace available. How can I implement a function skipSpaceNoNewline that skips any whitespace except \n and \r\n? Note: This question intentionally shows no research effort…
Uli Köhler
  • 13,012
  • 16
  • 70
  • 120
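A minimal sketch of such a function, assuming `skipWhile` from `Data.Attoparsec.ByteString.Char8` and `isSpace` from `Data.Char`. The predicate leaves every `\r` and `\n` unconsumed, which is a simplification: a lone `\r` not followed by `\n` is also left in the input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString.Char8
  (Parser, decimal, endOfLine, parseOnly, skipWhile)
import Data.Char (isSpace)

-- Skip whitespace but stop at '\n' or '\r', so line endings stay in the input.
-- Like skipSpace, this never fails; it may consume nothing.
skipSpaceNoNewline :: Parser ()
skipSpaceNoNewline = skipWhile (\c -> isSpace c && c /= '\n' && c /= '\r')
```

For example, `parseOnly (skipSpaceNoNewline *> decimal) " \t 42"` skips the leading blanks, while the same parser on `"\n42"` fails because the newline is not consumed.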
4
votes
1 answer

Conduit with aeson / attoparsec, how to exit cleanly without exception once source has no more data

I'm using aeson / attoparsec and conduit / conduit-http connected by conduit-attoparsec to parse JSON data from a file / webserver. My problem is that my pipeline always throws this exception... ParseError {errorContexts = ["demandInput"],…
NBFGRTW
  • 459
  • 3
  • 11
4
votes
2 answers

attoparsec incorrect parsing of doubles

I am using attoparsec's built-in parsers 'double' and 'number' to parse floating point values and I get different results from different parsers. >parse number "8.918605790440055e-2" Done "" 8.918605790440054e-2 > parse double…
4
votes
0 answers

Why does attoparsec use 100 times more memory than my input file?

I have a 2.5 MB file full of floats separated by spaces (the code below can generate it for you) and want to parse it into an array with attoparsec. It is surprisingly slow, taking almost a second, and allocating a lot of memory: time…
nh2
  • 24,526
  • 11
  • 79
  • 128
4
votes
4 answers

Parse recursive data with parsec

import Data.Attoparsec.Text.Lazy import Data.Text.Lazy.Internal (Text) import Data.Text.Lazy (pack) data List a = Nil | Cons a (List a) list :: Text list = pack $ unlines [ "0" , "1" , "2" , "5" ] How could a List Int parser be…
3
votes
1 answer

Implementing "includes" when parsing in Attoparsec

I am writing a DSL for fun. I decided to use attoparsec because I was familiar with it. I want to implement parsing of includes with relative filenames like this: include /some/dir/file.ext or URLs: include http://blah.com/my/file.ext So when I'm…
Alex
  • 8,093
  • 6
  • 49
  • 79
3
votes
2 answers

Parsing JPEG markers with attoparsec

As a project to further my knowledge and comfort with Haskell I am working towards implementing a JPEG decoder which will come in handy for future computer vision work. The first step I have chosen is to parse all "Markers" within the image. These…
nightski
  • 513
  • 2
  • 13
3
votes
1 answer

Attoparsec parse fails but shouldn't with proper backtracking

I am using Attoparsec which is said to backtrack by default. However, the following line: parseOnly (string "foo" *> many1 anyChar <* string "bar") "fooxxxbar" fails with: Left "not enough input" Why is that so? If many1 anyChar decides to parse…
Iguana Bob
  • 31
  • 2
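The failure is consistent with `many1 anyChar` consuming all remaining input, including the trailing "bar", so that `string "bar"` then runs at end of input. Attoparsec backtracks when an alternative fails, but it does not go back into `many1` to retry with fewer repetitions. A possible rewrite, assuming `Data.Attoparsec.Text` and using `manyTill` (the name `fooThenBar` is hypothetical):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text (Parser, anyChar, manyTill, parseOnly, string)

-- Require at least one character, then keep taking characters
-- until the terminator "bar" matches at the current position.
fooThenBar :: Parser String
fooThenBar =
  string "foo" *> ((:) <$> anyChar <*> manyTill anyChar (string "bar"))
```

`manyTill` tries the terminator before each character, so it stops at the first position where "bar" matches instead of running to the end of the input.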
3
votes
2 answers

Haskell : how to stop Data.Attoparsec.Char8.sepBy when input String is empty?

I've written the following Haskell code: import Data.Attoparsec (Parser) import qualified Data.Attoparsec.Char8 as A import qualified Data.ByteString.Char8 as B someWithSep sep p = A.sepBy p sep The code is supposed to work this way: main*> A.parse…
Fopa Léon Constantin
  • 11,863
  • 8
  • 48
  • 82
3
votes
1 answer

Conditional lookahead in attoparsec

Assume there is a data structure representing a text with comments inside. data TWC = T Text TWC -- text | C Text TWC -- comment | E -- end deriving Show Thus string like "Text, {-comment-}, and something else" could be encoded as T…
3
votes
2 answers

conduit: producing memory leak

Working on some observations on a previous question (haskell-data-hashset-from-unordered-container-performance-for-large-sets) I stumbled upon a strange memory leak module Main where import System.Environment (getArgs) import…
epsilonhalbe
  • 15,637
  • 5
  • 46
  • 74
3
votes
1 answer

Parse identifiers that don't end with certain characters in attoparsec

I am stuck writing an attoparsec parser to parse what the Uniform Code for Units of Measure calls a . It's defined to be the longest sequence of characters in a certain class (that class includes all the digits 0-9) which doesn't end…
Doug McClean
  • 14,265
  • 6
  • 48
  • 70
3
votes
1 answer

How can I write a more general (but efficient) version of attoparsec's takeWhile1?

Data.Attoparsec.Text exports takeWhile and takeWhile1: takeWhile :: (Char -> Bool) -> Parser Text Consume input as long as the predicate returns True, and return the consumed input. This parser does not fail. It will return an empty string if the…
jub0bs
  • 60,866
  • 25
  • 183
  • 186
3
votes
1 answer

How can I parse fixed-length, non-delimited integers with attoparsec?

I'm trying to parse two integers from 3 characters using attoparsec. A sample input might look something like this: 341 ... which I would like to parse into: Constructor 34 1 I have two solutions that work but which are somewhat clunky: stdK ::…
Frank Wang
  • 181
  • 9
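One possible shape for this, sketched with `count` and `digit` from `Data.Attoparsec.Text`. The type name `Record` and the helper `fixedInt` are hypothetical; only the constructor name `Constructor` comes from the question, and this is not one of the asker's two solutions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text (Parser, count, digit, parseOnly)
import Data.Char (digitToInt)
import Data.List (foldl')

-- Hypothetical target type for the two integers.
data Record = Constructor Int Int deriving (Eq, Show)

-- Hypothetical helper: read exactly n decimal digits as one Int.
fixedInt :: Int -> Parser Int
fixedInt n = foldl' (\acc d -> acc * 10 + digitToInt d) 0 <$> count n digit

-- Two digits for the first field, one digit for the second.
record :: Parser Record
record = Constructor <$> fixedInt 2 <*> fixedInt 1
```

With this, `parseOnly record "341"` gives `Right (Constructor 34 1)`, and input with fewer than three digits fails.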