Haskell quickBatch: Testing ZipList Monoid at mconcat results in stack overflow

Question

I've created orphan instances of Semigroup and Monoid for ZipList. However, when I run quickBatch's monoid tests, the mconcat test fails with a stack overflow. Why does this error occur, and how do I resolve it? Is it due to pure mempty, which I don't fully understand? I took most of this from HaskellBook, Chapter 17 (Applicative), section 17.8, ZipList Monoid.

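import Control.Applicative (ZipList (..), liftA2)
import Data.Monoid (Sum (..))
import Test.QuickCheck.Checkers (quickBatch)
import Test.QuickCheck.Classes (monoid)

zl :: ZipList (Sum Int)
zl = ZipList [1, 1 :: Sum Int]

-- Orphan instance: combine two ZipLists element by element.
instance Semigroup a
  => Semigroup (ZipList a) where
    (<>) = liftA2 (<>)

-- Orphan instance: mempty is an infinite ZipList of the element type's mempty.
instance (Eq a, Monoid a)
  => Monoid (ZipList a) where
    mempty = pure mempty
    mappend = (<>)
    mconcat as =
      foldr mappend mempty as

main :: IO ()
main = do
  quickBatch $ monoid zl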

`mempty :: ZipList a` is an infinite list: I wonder if it is trying to do something like `mempty <> mempty` (though it seems odd it would try that for `mconcat` but not `mappend` itself). — chepner, Jan 14 '21 at 15:47
@chepner Thanks for the hint, but I'm still confused about the use of `pure mempty`. What is `mempty` supposed to be or do here? — maxloo, Jan 15 '21 at 14:03
For `ZipList`, `(<>)` simply builds a `ZipList` by using `(<>)` on corresponding elements of its two arguments. That is, `ZipList [a,b,c] <> ZipList [x, y ,z] == ZipList [a<>x, b<>y, c<>z]`. Since `mempty :: ZipList a` has to be able to combine with *any* `ZipList` value, it has to be an infinite list, so that `(<>)` never runs out of elements to pair with the other list. My assumption is that somehow `mconcat` is attempting to iterate over the "entire" value of `mempty`, leading to infinite recursion and a stack overflow, but I'm not sure how. — chepner, Jan 15 '21 at 14:32
If you are asking about `mempty = pure mempty`, it just means that given `mempty :: a`, you call `pure` on it to get a `ZipList a`, and `pure :: a -> ZipList a` creates an infinite list of `a`s. — chepner, Jan 15 '21 at 14:33
@chepner Thanks for shining some light on this! I'm now also looking at https://stackoverflow.com/questions/50130388/ziplist-monoid-haskell, and I'm hoping I can use that with Arbitrary and quickBatch so that the monoid mconcat problem can be resolved. — maxloo, Jan 15 '21 at 14:45

DDub · Accepted Answer · 2021-01-18T16:10:47.447

Yes, the error is due to pure mempty, but that doesn't mean pure mempty is wrong. Let's look there first.

It helps a lot to look at the types involved in the definition mempty = pure mempty:

mempty :: ZipList a
mempty = (pure :: a -> ZipList a) (mempty :: a)

Basically, we're going to use the pure operation to create a ZipList out of the mempty of type a. It helps from here to look at the definition of pure for ZipList:

pure :: a -> ZipList a
pure x = ZipList (repeat x)

In total, mempty for ZipList a is going to be a ZipList containing the infinitely repeating list of mempty values of the underlying type a.
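To make this concrete, here is a minimal sketch that uses only base and avoids defining any instances. The names zipMempty, prefix, and combined are hypothetical, chosen for illustration; liftA2 (<>) stands in for the (<>) defined in the question.

```haskell
import Control.Applicative (ZipList (..), liftA2)
import Data.Monoid (Sum (..))

-- The identity element from the question: an endless ZipList of mempty values.
zipMempty :: ZipList (Sum Int)
zipMempty = pure mempty

-- Inspect only a finite prefix of a (possibly infinite) ZipList.
prefix :: Int -> ZipList a -> [a]
prefix n = take n . getZipList

-- Pointwise combination with the infinite identity leaves a finite list
-- unchanged, because zipping stops at the shorter argument.
combined :: ZipList (Sum Int)
combined = liftA2 (<>) (ZipList [1, 2, 3]) zipMempty
```

In GHCi, prefix 3 zipMempty yields three Sum 0 values, and getZipList combined gives back the original three elements. Evaluating getZipList zipMempty in full would never finish.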

Now back to the error you're getting. When you run the monoid test batch over ZipList (Sum Int), QuickCheck tests a sequence of properties.

The first two check the left identity and right identity properties. They generate values x :: ZipList (Sum Int) and verify that x <> mempty = mempty <> x = x.
The third checks that for any two values x, y :: ZipList (Sum Int), we have x `mappend` y = x <> y.
The fourth checks that for any list of values xs :: [ZipList (Sum Int)], folding the list with mappend gives the same result as applying mconcat to it.

Before I continue, it's really important to note that when I say "for any value", I really mean that QuickCheck is using the Arbitrary instance of the said type to generate values of that type. Furthermore, the Arbitrary instance for ZipList a is the same as the Arbitrary instance for [a] but then wrapped in ZipList. Lastly, the Arbitrary instance for [a] will never produce an infinite list (because those will cause problems when you're checking for equality, like going into an infinite loop or overflowing the stack), so these "for any values" of type ZipList (Sum Int) will never be infinite either.

Specifically, this means that QuickCheck will never arbitrarily generate the value mempty :: ZipList a because this is an infinite list.

So why do the first three pass while the last one fails with a stack overflow? In the first three tests, we never end up trying to compare an infinite list to an infinite list. Let's see why not.

In the first two tests, we're looking at x <> mempty == x and mempty <> x == x. In both cases, x is one of our "arbitrary" values, which will never be infinite, so this equality will never go into an infinite loop.
In the third test, we're generating two finite ZipLists x and y and mappending them together. Nothing about this will be infinite.
In the fourth test, we're generating a list of ZipLists and applying mconcat to it. But what happens if the list is empty? Well, mconcat [] = mempty, and folding an empty list also produces mempty. This means that if the empty list is generated as the arbitrary input (which is perfectly possible), the test will try to confirm that one infinite list is equal to another infinite list, which will always result in a stack overflow or an infinite loop.
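The empty-input case can be sketched with base alone. The names viaMconcat, viaFoldr, and equalUpTo are hypothetical; viaMconcat is written out as pure mempty because that is what mconcat [] evaluates to under the question's instance.

```haskell
import Control.Applicative (ZipList (..), liftA2)
import Data.Monoid (Sum (..))

-- What mconcat [] reduces to when mempty = pure mempty.
viaMconcat :: ZipList (Sum Int)
viaMconcat = pure mempty

-- What foldr mappend mempty [] reduces to: the seed, which is again infinite.
viaFoldr :: ZipList (Sum Int)
viaFoldr = foldr (liftA2 (<>)) (pure mempty) []

-- Terminates, because only the first n elements of each side are inspected.
equalUpTo :: Eq a => Int -> ZipList a -> ZipList a -> Bool
equalUpTo n (ZipList xs) (ZipList ys) = take n xs == take n ys

-- Does not terminate: every pair of elements is equal, so the comparison
-- never reaches a difference or an end of list.
-- diverges :: Bool
-- diverges = viaMconcat == viaFoldr
```

A bounded comparison such as equalUpTo 100 viaMconcat viaFoldr finishes, which is the idea behind comparing only a finite prefix.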

How can you fix this? I can come up with two methods:

You can define your own version of EqProp for ZipList so that it only compares equality on some finite prefix of the list. This would likely involve making a newtype wrapper (perhaps newtype MonZipList a = MonZipList (ZipList a)), deriving a bunch of instances, and then writing an EqProp one by hand. This will probably work but is a little inelegant.
You can write your own version of monoid that uses a different version of the fourth test. For instance, if you restrict it so that the test only uses non-empty lists, then you won't have any problem. To do this, you should start by looking at the definition of the monoid property tests. Notice that it currently defines the "mconcat" property as property mconcatP where

mconcatP :: [a] -> Property
mconcatP as = mconcat as =-= foldr mappend mempty as

Using QuickCheck's own NonEmptyList modifier (a newtype whose data constructor is named NonEmpty), you can rewrite this for your purposes as:

mconcatP :: NonEmptyList a -> Property
mconcatP (NonEmpty as) = mconcat as =-= foldr mappend mempty as
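A self-contained sketch of the non-empty variant, using only QuickCheck, might look like the following. Note that the type is called NonEmptyList while its data constructor is NonEmpty, so the pattern must match on NonEmpty. The names mconcatNonEmpty and runCheck are hypothetical, (===) is used in place of checkers' (=-=), and the property is run at [Int] purely as an example type.

```haskell
import Test.QuickCheck (NonEmptyList (..), Property, quickCheck, (===))

-- The mconcat law restricted to non-empty input lists.
mconcatNonEmpty :: (Monoid a, Eq a, Show a) => NonEmptyList a -> Property
mconcatNonEmpty (NonEmpty as) = mconcat as === foldr mappend mempty as

-- Example run at a concrete type: lists of Ints under (++).
runCheck :: IO ()
runCheck = quickCheck (mconcatNonEmpty :: NonEmptyList [Int] -> Property)
```

Wiring this into a replacement for checkers' monoid batch is left out here, since that depends on how the batch is defined in the installed version.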

Obviously, this is a slightly weaker condition, but at least it's one that won't hang.
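For the first method, a rough sketch could look like this. It assumes the checkers package (EqProp, eq, quickBatch, monoid) and a base version where Semigroup is in the Prelude; the wrapper name MonZipList follows the suggestion above, while runBatch and the cutoff of 100 elements are arbitrary choices made for illustration.

```haskell
import Control.Applicative (ZipList (..), liftA2)
import Data.Monoid (Sum (..))
import Test.QuickCheck (Arbitrary (..))
import Test.QuickCheck.Checkers (EqProp (..), eq, quickBatch)
import Test.QuickCheck.Classes (monoid)

-- Hypothetical wrapper type, using the name suggested in the answer.
newtype MonZipList a = MonZipList (ZipList a)
  deriving (Show)

instance Semigroup a => Semigroup (MonZipList a) where
  MonZipList xs <> MonZipList ys = MonZipList (liftA2 (<>) xs ys)

instance Monoid a => Monoid (MonZipList a) where
  mempty = MonZipList (pure mempty)

-- Generate finite values only, by reusing the list generator.
instance Arbitrary a => Arbitrary (MonZipList a) where
  arbitrary = MonZipList . ZipList <$> arbitrary

-- Compare only the first 100 elements (an arbitrary cutoff), so that two
-- infinite values can still be compared in finite time.
instance Eq a => EqProp (MonZipList a) where
  MonZipList (ZipList xs) =-= MonZipList (ZipList ys) =
    take 100 xs `eq` take 100 ys

-- Run the checkers monoid batch at a concrete element type.
runBatch :: IO ()
runBatch = quickBatch (monoid (MonZipList (ZipList [1 :: Sum Int])))
```

The trade-off is that differences beyond the cutoff go undetected, so the cutoff should be larger than the lists QuickCheck is likely to generate.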

Thanks! Could you elaborate more about your solutions? For your first solution, how do I create a quickBatch test for a finite prefix of ZipList? — maxloo, Jan 18 '21 at 15:36
For your second solution, I've tried to restrict the mconcat test by defining, in `instance (Eq a, Monoid a) => Monoid (ZipList a)`, `mconcat as = if as /= [] then foldr mappend mempty as else ZipList []`, but it didn't work because the mconcat test fails on the first try. — maxloo, Jan 18 '21 at 15:40
I updated the answer with some suggestions about how to make the changes I suggested. As for your alternate `mconcat` definition, this will fail the test because the test still expects the result of `mconcat []` to be an infinite list. True, it's not going into an infinite loop any more, but that's because it's now obvious that `[]` is not an infinite list. — DDub, Jan 18 '21 at 16:14
OK, I've tried QuickCheck's NonEmptyList, but if I use `mconcatP (NonEmptyList as) = ..`, I get the error `Not in scope: data constructor ‘NonEmptyList’`, and if I use `mconcatP (nonEmptyList as) = ..` or `mconcatP (getNonEmpty as) = ..`, I get `Parse error in pattern`. — maxloo, Jan 18 '21 at 17:06
I got the idea of using `getNonEmpty` from: https://stackoverflow.com/questions/28329303/how-can-i-constrain-a-quickcheck-parameter-to-a-list-of-non-empty-strings — maxloo, Jan 18 '21 at 17:08
I also created `nonEmptyList = listOf1 (arbitrary :: Gen [Int])`, but I realised that it doesn't work with `mconcatP (nonEmptyList as) = ..`. Is there something wrong with the way I'm using NonEmptyList? — maxloo, Jan 18 '21 at 17:24
Perhaps you need to import it? `import Test.QuickCheck (NonEmptyList(..))` — DDub, Jan 18 '21 at 19:53
I've already done that. Don't worry; if you don't have the answer, I'll post another question about this. I'm also having problems understanding Ap, and I've posted my questions at https://stackoverflow.com/questions/65763728/haskell-quickbatch-ap-applicative-monoid. Maybe you can take a look? You may know more than enough to answer my questions, because my understanding of Ap is quite poor. — maxloo, Jan 19 '21 at 09:48

Daniel Martin · Answer 2 · 2022-06-27T19:09:06.930

As an aside, this definition of Monoid for ZipList is inconsistent with the definition of Alternative for ZipList.

I would propose instead:

instance Semigroup a => Semigroup (ZipList a) where
  ZipList [] <> ZipList ys = ZipList ys
  ZipList xs <> ZipList [] = ZipList xs
  ZipList (x:xs) <> ZipList (y:ys) = ZipList (x <> y : getZipList (ZipList xs <> ZipList ys))

instance Semigroup a => Monoid (ZipList a) where
  mempty = ZipList []

There is no formal requirement that empty in an Alternative instance be the same as mempty in a Monoid instance, but I wouldn't make them differ without an extremely good reason.
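As a rough illustration of how this proposal behaves, the same combining logic can be written as a plain function, which avoids declaring orphan instances. The names padAppend and example are hypothetical, and the sketch assumes a base version that provides the Alternative instance for ZipList.

```haskell
import Control.Applicative (Alternative (..), ZipList (..))
import Data.Monoid (Sum (..))

-- Pointwise combination that keeps the tail of the longer list instead of
-- truncating to the shorter one.
padAppend :: Semigroup a => ZipList a -> ZipList a -> ZipList a
padAppend (ZipList xs) (ZipList ys) = ZipList (go xs ys)
  where
    go [] bs = bs
    go as [] = as
    go (a : as) (b : bs) = (a <> b) : go as bs

-- The Alternative identity `empty` (ZipList []) is also an identity for
-- padAppend, so this evaluates to the right-hand argument unchanged.
example :: ZipList (Sum Int)
example = padAppend empty (ZipList [1, 2, 3])
```

Because the identity here is the finite value ZipList [], comparing mconcat [] against a fold over the empty list terminates, so the equality problem discussed in the accepted answer does not arise with this definition.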
