I can't understand what the type of (for example) eol means:

eol :: (MonadParsec e s m, Token s ~ Char) => m String

or, better, I don't understand how to use eol with Text.Megaparsec.Text and not Text.Megaparsec.String.

I've been trying to learn how to use Megaparsec by following the (old) Parsec tutorial from Real World Haskell (I started reading the RWH tutorial before finding out that Megaparsec existed). I rewrote the code of the first example to use Megaparsec (see below). But when I try to force the type of eol to Parser Text, the compiler throws the error: Couldn't match type ‘[Char]’ with ‘Text’. What I gather from this is that I cannot use eol with Text or, more likely, that I don't know how to change the Token s ~ Char context in the declaration of eol to use Token Text.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NoImplicitPrelude #-}

module CSVParser (
  module CSVParser
) where

import Foundation
import Data.Functor.Identity (Identity)
import Text.Megaparsec
import Text.Megaparsec.Text
import Data.Text

csvFile :: Parser [[Text]]
csvFile =
    do result <- many line
       eof
       return result

line :: Parser [Text]
line =
    do result <- cells
       --eol :: Parser Text -- uncommenting this line results in a compilation error
       eol
       return result

cells :: Parser [Text]
cells =
    do first <- cellContent
       next <- remainingCells
       return (first : next)

remainingCells =
    (char ',' >> cells)
    <|> return []

cellContent :: Parser Text
cellContent = fromList <$> many (noneOf [',','\n'])

parseCSV :: Text -> Either (ParseError (Token Text) Dec) [[Text]]
parseCSV = parse csvFile "(unknown)"
helq
  • Why are you writing `eol :: Parser Text` when you're ignoring its return value anyway? – Benjamin Hodgson Jun 21 '17 at 16:43
  • Well, I do it because I want to know how to change its type. I want to change its type because many other functions in the library have the same kind of type declaration. Take for example `lowerChar :: (MonadParsec e s m, Token s ~ Char) => m Char`: I may not want to ignore its return value but constrain it to be "Text" (for that I could do `fromList <$> lowerChar`, but that seems ugly; I suppose I could just change the type directly, but I don't know, or understand, how). Mainly my problem is with `(MonadParsec e s m, Token s ~ Char) => m Char`. – helq Jun 21 '17 at 17:07

1 Answer

In the type:

eol :: (MonadParsec e s m, Token s ~ Char) => m String

the ~ is a type equality constraint. MonadParsec is a typeclass and Token is a type family (associated with the Stream typeclass); both are defined by Megaparsec. They can roughly be interpreted as follows:

  • MonadParsec e s m is an assertion that type m is a monadic parser that reads a Stream of type s and represents errors using an ErrorComponent of type e
  • Token s is the underlying type of the tokens read from stream s

So, the full type can be interpreted as: eol is a monadic parser with "return value" String that parses a stream whose tokens are Char.
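
To make this reading concrete, here is a minimal sketch that uses only base. The names `MiniStream`, `Tok`, `Packed`, and `miniEol` are hypothetical stand-ins (not Megaparsec's definitions) for `Stream`, `Token`, `Text`, and `eol`. It tries to show that a constraint of the form `Tok s ~ Char` only restricts the input stream, while the result type stays `String`.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

-- Hypothetical stand-in for Megaparsec's Stream class:
-- every stream type s has an associated token type, Tok s.
class MiniStream s where
  type Tok s
  unconsTok :: s -> Maybe (Tok s, s)

-- A list is a stream of its elements, so Tok String reduces to Char.
instance MiniStream [a] where
  type Tok [a] = a
  unconsTok []       = Nothing
  unconsTok (x : xs) = Just (x, xs)

-- Hypothetical stand-in for a packed text type such as Text.
newtype Packed = Packed String

-- A different stream type whose tokens are also Char.
instance MiniStream Packed where
  type Tok Packed = Char
  unconsTok (Packed [])       = Nothing
  unconsTok (Packed (c : cs)) = Just (c, Packed cs)

-- Shaped like eol: accepts any stream whose tokens are Char,
-- and always returns the matched newline as a String.
miniEol :: (MiniStream s, Tok s ~ Char) => s -> Maybe (String, s)
miniEol s0 =
  case unconsTok s0 of
    Just ('\n', rest) -> Just ("\n", rest)
    Just ('\r', rest) ->
      case unconsTok rest of
        Just ('\n', rest') -> Just ("\r\n", rest')
        _                  -> Nothing
    _ -> Nothing

main :: IO ()
main = do
  -- Input is a String; the result is a String.
  print (fst <$> miniEol "\r\nrest")
  -- Input is a Packed; the result is still a String.
  print (fst <$> miniEol (Packed "\nrest"))
```

Both calls typecheck because `Tok String` and `Tok Packed` each reduce to `Char`; in neither case does the choice of input type affect the `String` in the result.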

For your problem, most of this can be ignored. The issue you're running into is that eol returns a String value as the result of the parse, and a String isn't a Text, so you can't make an eol (which is of type Parser String) be of type Parser Text, no matter how hard you try.

Two solutions are to ignore the unwanted String return value or, if you need it as text, convert it:

Data.Text.pack <$> eol
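
Both options can be sketched as follows, assuming the Megaparsec 5 API used in the question (`Text.Megaparsec.Text` and `Dec`; later versions differ). The names `endOfLine`, `eolText`, and `lowerText` are hypothetical helpers introduced for illustration, and `lowerText` applies the same conversion idea to the `lowerChar` case raised in the comments.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad (void)
import Data.Text (Text)
import qualified Data.Text as T
import Text.Megaparsec
import Text.Megaparsec.Text (Parser)

-- Option 1: discard the String that eol returns.
endOfLine :: Parser ()
endOfLine = void eol

-- Option 2: convert the String result to Text.
eolText :: Parser Text
eolText = T.pack <$> eol

-- The same idea for a parser that returns a single Char.
lowerText :: Parser Text
lowerText = T.singleton <$> lowerChar

main :: IO ()
main = do
  -- The input literals are Text here (via OverloadedStrings).
  print (parse endOfLine "(unknown)" "\n")
  print (parse eolText "(unknown)" "\r\n")
  print (parse lowerText "(unknown)" "a")
```

In each case the input stream remains `Text`; only the result of the parser is discarded or converted.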
K. A. Buhr
  • Thanks for the explanation of `~`, it really helped me. I supposed that the type `Parsec Dec Text` (written as `Parser` when importing `Megaparsec.Text`) was conditioning the OUTPUT of `eol` to be `Text`, when in reality it only enforces `Text` as the INPUT stream of `eol`, so the type of `eol` in my (code) example would be `Parsec Dec Text [Char]`. But I wanted to get `Parsec Dec Text Text`, which is clearly impossible, because `eol` has type `(constraint) => m [Char] === Parsec e s [Char]`. – helq Jun 22 '17 at 15:09
  • Btw, I also learned that `Token Text == Char`, which I wasn't expecting, because I saw `Text` and `[Char]` as different things (and they are: `[Char]` is a linked list, and `Text` is something else) but never thought about their constituent parts, which are just `Char`s. It makes sense now: `Parsec Dec Text [Char]` is a totally valid type for `eol`. I thought it wasn't because `Text != [Char]`, but it is; just taking `eol :: (MonadParsec e s m, Token s ~ Char) => m [Char]` and replacing `e` with `Dec` and `s` with `Text` (which fulfills `Token s ~ Char`!) gives us a totally valid type. – helq Jun 22 '17 at 15:19