
Let me use this example:

1 2 3 / 4 5 6

should parse into:

[[1, 2, 3], [4, 5, 6]]

So I write:

p1 :: Parser (List Char)
p1 = sepBy anyDigit (char ' ')

p2 :: Parser (List (List Char))
p2 = sepBy p1 (string " / ")

Alas, this fails:

(Left Character '/' is not a digit)

What is the right way to handle this?

levant pied

1 Answer


The problem is that the separator " / " starts with a space. When p1 reaches the space after the 3, it commits to parsing its own separator and then expects another digit. It finds '/' instead and fails.

You have a few options. You could change p1 so that its separator only accepts a space that is not followed by the '/' operator:

sepBy anyDigit (char ' ' <* notFollowedBy (char '/'))

Alternatively, have your lexemes eagerly consume any trailing whitespace:

myDigit = anyDigit <* many whitespace

p1 = many1 myDigit
p2 = sepBy p1 (char '/' <* many whitespace)

Another option is to split parsing into two phases: an initial lexing phase that breaks the input into lexemes and discards whitespace, followed by a parser that runs over the resulting tokens. You would no longer be able to use string-parsers for the second phase, but purescript-parsing can work on a stream of tokens.
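As a rough illustration of the two-phase idea, here is a minimal hand-rolled sketch. It does not use string-parsers or purescript-parsing: the second phase is a plain fold standing in for a real token parser. The names `Token`, `lex`, and `groups` are hypothetical, and the lexer is assumed to be free to drop every character that is not a digit or '/'.

```purescript
module LexSketch where

import Prelude

import Data.Array (mapMaybe, snoc)
import Data.Foldable (foldl)
import Data.Maybe (Maybe(..))
import Data.String.CodeUnits (toCharArray)

-- Hypothetical token type: whitespace never becomes a token.
data Token = Digit Char | Slash

-- Phase 1: turn the input into tokens, discarding whitespace.
-- Assumption: anything that is not a digit or '/' is silently dropped.
lex :: String -> Array Token
lex = mapMaybe toToken <<< toCharArray
  where
  toToken :: Char -> Maybe Token
  toToken '/' = Just Slash
  toToken c
    | c >= '0' && c <= '9' = Just (Digit c)
    | otherwise = Nothing

-- Phase 2: group digits, starting a new group at every Slash.
groups :: Array Token -> Array (Array Char)
groups tokens = finish (foldl step { done: [], current: [] } tokens)
  where
  step acc (Digit c) = acc { current = snoc acc.current c }
  step acc Slash = { done: snoc acc.done acc.current, current: [] }
  finish acc = snoc acc.done acc.current

-- groups (lex "1 2 3 / 4 5 6")
--   == [['1', '2', '3'], ['4', '5', '6']]
```

Because the tokens no longer contain spaces, the ambiguity between the " " and " / " separators cannot arise in the second phase.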

Phil Freeman
  • I get `Maximum call stack size exceeded` with the second option (eagerly consuming trailing whitespace). – levant pied May 30 '17 at 22:07
  • Ah, I think that might be because `whitespace` succeeds on an empty string, so `many whitespace` will loop. You could try `many (char ' ')` instead. – Phil Freeman May 30 '17 at 22:46