
My types:

data Test = Test {
 a :: Int,
 b :: Int
} deriving (Show)

My parser:

testParser :: Parser Test
testParser = do
  a <- decimal
  tab
  b <- decimal
  return $ Test a b

tab = char '\t'

Now in order to skip the first line, I do something like this:

import qualified System.IO as IO    

parser :: Parser Test
parser = manyTill anyChar endOfLine *> testParser

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (parsed (parser <* endOfLine) (fromHandle testHandle)) (lift . print)

But the above parser function skips every other record (which is obvious, since it discards one line before each record it parses). How can I skip only the first line in a way that works with the Pipes ecosystem (the Producer should yield one Test value at a time)? There is one obvious solution that I don't want, because it returns the entire [Test] instead of a single value (the code below also only works if I modify testParser to consume newlines):

tests :: Parser [Test]
tests = manyTill anyChar endOfLine *>
        many1 testParser

Any ideas on how to tackle this problem?

Sibi
  • By the way, you switch between `Test` and `Link`. – Zeta Jul 10 '14 at 12:55
  • @Zeta Sorry, that's my mistake. Updated to make it `Test`. (My original data structure is actually `Link` which has more fields. I just simplified it to `Test` for this question.) – Sibi Jul 10 '14 at 14:05

2 Answers


If the first line doesn't contain any valid Test, you can use Either () Test in order to handle it:

parserEither :: Parser (Either () Test)
parserEither = Right <$> testParser <* endOfLine 
           <|> Left <$> (manyTill anyChar endOfLine *> pure ())
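
To see how the two alternatives behave on individual lines, here is a minimal sketch that runs the parser with attoparsec's `parseOnly`. It assumes the `Data.Attoparsec.ByteString.Char8` flavour of attoparsec; `classify` is a hypothetical helper introduced only for this illustration, and `Test` derives `Eq` here purely so results can be compared.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8
import Data.ByteString (ByteString)

data Test = Test { a :: Int, b :: Int } deriving (Show, Eq)

-- Two tab-separated decimals, as in the question.
testParser :: Parser Test
testParser = Test <$> decimal <* char '\t' <*> decimal

-- Try a record first; if that fails, attoparsec backtracks and the
-- second alternative swallows the rest of the line as a Left ().
parserEither :: Parser (Either () Test)
parserEither = Right <$> testParser <* endOfLine
           <|> Left () <$ manyTill anyChar endOfLine

-- Hypothetical helper for this sketch: parse exactly one line.
classify :: ByteString -> Either String (Either () Test)
classify = parseOnly parserEither
```

With these definitions, `classify "name\tvalue\n"` evaluates to `Right (Left ())`, while `classify "1\t2\n"` evaluates to `Right (Right (Test {a = 1, b = 2}))`.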

After this you can use the functions provided by Pipes.Prelude to get rid of the first result (and additionally of all non-parseable lines):

producer p = parsed parserEither p 
         >-> P.drop 1 
         >-> P.filter (either (const False) (const True))
         >-> P.map    (\(Right x) -> x)

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (producer (fromHandle testHandle)) (lift . print)
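
The `P.filter` / `P.map` pair above can probably be collapsed into a single stage: `Either e` is `Foldable`, so `P.concat` from `Pipes.Prelude` yields the payload of every `Right` and silently drops every `Left`, which also avoids the partial lambda. The following is only a sketch under that assumption; `results` and `keepRights` are hypothetical names, and the list is a stand-in for the stream that `parsed parserEither` would produce.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical stand-in for the output of `parsed parserEither`:
-- a header line, two records, and one unparseable line in between.
results :: [Either () (Int, Int)]
results = [Left (), Right (1, 2), Left (), Right (3, 4)]

-- Drop the first result (the header), then keep only the Right payloads.
keepRights :: Monad m => Producer (Either () a) m r -> Producer a m r
keepRights p = p >-> P.drop 1 >-> P.concat

-- Run the pipeline purely and collect what comes out.
collected :: [(Int, Int)]
collected = P.toList (keepRights (each results))
```

Here `collected` is `[(1,2),(3,4)]`. Note that `P.drop 1` discards the first result unconditionally, so it would also drop a valid record if the input had no header line.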
Zeta

You can drop the first line efficiently in constant space like this:

import Lens.Family (over)
import Pipes.Group (drops)
import Pipes.ByteString (lines)
import Prelude hiding (lines)

dropLine :: Monad m => Producer ByteString m r -> Producer ByteString m r
dropLine = over lines (drops 1)

You can apply dropLine to your Producer before you parse the Producer, like this:

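main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
    let p = dropLine (fromHandle testHandle)
    -- The header is already gone, so parse with testParser rather than
    -- the question's line-skipping `parser`.
    in  for (parsed (testParser <* endOfLine) p) (lift . print)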
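
As a quick check that `dropLine` works on the byte stream itself and not on pre-split lines, here is a small sketch that feeds it an in-memory producer whose first line straddles a chunk boundary. It assumes the `pipes-bytestring` and `lens-family-core` packages; `chunks` and `remaining` are made-up names and sample data for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Lens.Family (over)
import Pipes
import qualified Pipes.ByteString as PB
import Pipes.Group (drops)
import qualified Pipes.Prelude as P

-- Same definition as above, with the lens qualified to avoid
-- clashing with Prelude.lines.
dropLine :: Monad m => Producer ByteString m r -> Producer ByteString m r
dropLine = over PB.lines (drops 1)

-- Hypothetical sample input: the header line "id\tcount" is split
-- across the first two chunks.
chunks :: [ByteString]
chunks = ["id\tco", "unt\n1\t2\n", "3\t4\n"]

-- Everything that survives dropLine, glued back into one ByteString.
remaining :: ByteString
remaining = B.concat (P.toList (dropLine (each chunks)))
```

The header bytes are discarded as they stream past, so `remaining` starts at the first record even though the header never arrived as a single chunk.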
Gabriella Gonzalez
  • What if I don't want to drop a line, but just await lines? Is there a better way than `over lines (drops o)`? –  Feb 16 '15 at 14:27
  • @Igor With the exception of `Pipes.Prelude`, the `pipes` ecosystem discourages reading entire lines into memory since they could be arbitrarily long. To see how to do this idiomatically, study the [Pipes.Group tutorial](http://hackage.haskell.org/package/pipes-group-1.0.1/docs/Pipes-Group-Tutorial.html) and check out [Pipes.Text.lines](http://hackage.haskell.org/package/pipes-text-0.0.0.15/docs/Pipes-Text.html#v:lines) – Gabriella Gonzalez Feb 16 '15 at 20:04