4

I'm currently solving AOC 4th task where there is following input format, a line of numbers separated by a comma and then 5x5 matrices:

27,14,70,7,85,66,65

31 23 52 26  8
27 89 37 80 46
97 19 63 34 79
13 59 45 12 73
42 25 22  6 39

27 71 24  3  0
79 42 32 72 62
99 52 11 92 33
38 22 16 44 39
35 26 76 49 58

27 71 24 ...

I have the following parser:

{-# LANGUAGE TupleSections #-}

module Lib4Parse (parseAll) where

import Control.Applicative
import Control.Monad
import Data.Attoparsec.ByteString.Char8 (Parser, Result, char, count, decimal, eitherResult, endOfInput, endOfLine, isDigit, isEndOfLine, isSpace, many', many1, option, parse, parseOnly, parseTest, scientific, sepBy, signed, skipMany, skipMany1, skipSpace, skipWhile, space, takeTill)
import qualified Data.ByteString as B

newtype BoardEntry = MkBoardEntry (Bool, Int) deriving (Show, Eq)

newtype Board = MkBoard [[BoardEntry]] deriving (Show, Eq)

parseAll :: B.ByteString -> Maybe ([Int], [Board])
parseAll input =
  either (const Nothing) Just $
    parseOnly ((,) <$> inputNumbers <*> boards) input

inputNumbers :: Parser [Int]
inputNumbers = decimal `sepBy` char ','

boardEntry :: Parser [BoardEntry]
boardEntry = map (\x -> MkBoardEntry (False, x)) <$> (decimal `sepBy` many1 (char ' '))

board :: Parser Board
board =
  MkBoard <$> count 5 (boardEntry <* (skipWhile isSpace <|> endOfInput))

boards :: Parser [Board]
boards = takeTill isDigit >> many board

The problem is that when I use many in the boards function it hangs (when calling the parseAll function). But, in case the boards function looks like this it works:

boards :: Parser [Board]
boards = takeTill isDigit >> count 5 board

However, I want to parse all boards not just specified number of them.

Questions

What is the problem that this hangs, how could this be solved?

How to debug such parsers? - I'm unable to debug this since I don't know what the parser is doing. Any tips and tricks for debugging such parsers?

xbalaj
  • 977
  • 1
  • 8
  • 14
  • 1
    This is almost certainly because you're using `many` with a parser that can accept empty input. This will always cause an infinite loop in parsec-likes. – Carl Feb 14 '22 at 23:04

1 Answers1

4

As per @Carl's comment, the problem is that board can match empty input, so many board tries to parse infinitely many empty boards when it hits the end of input. Specifically, a board matches five boardEntrys, but boardEntry can match zero numbers (i.e., an empty string). You could either require that it match five numbers, if all rows should be that long, or you can upgrade your sepBy to sepBy1 to ensure at least one number is matched:

boardEntry :: Parser [BoardEntry]
boardEntry = map (\x -> MkBoardEntry (False, x)) <$> (decimal `sepBy1` many1 (char ' '))

My usual approach for debugging parsers is to run small pieces in isolation:

> :set -XOverloadedStrings
> parseOnly boardEntry "1 2 3 4 5\n"
Right [MkBoardEntry (False,1),...]

In this case, that didn't help that much. I happened to spot the issue in boardEntry before I got too far.

K. A. Buhr
  • 45,621
  • 3
  • 45
  • 71
  • Thanks for the response! Yeah actually I was trying those parsers individually, however, I didn't know about that intrigue of infinitely many empties :) – xbalaj Feb 14 '22 at 23:47