Pattern matching and infinite lists

Question

I'm having trouble understanding this simple snippet of code:

-- This works:     foldr go1 [] [1..]
-- This doesn't:   foldr go2 [] [1..]

go1 a b = a : b

go2 a [] = a : []
go2 a b  = a : b

Folding with go1 immediately starts returning values, but go2 appears to be waiting for the end of the list.

Clearly the pattern matching is causing something to be handled differently. Can someone explain what exactly is going on here?

Thanks for the excellent answers everyone. If I could select multiple solutions, I would, because they all helped me grok the concept. — , Nov 02 '14 at 23:48
in addition to "selecting" an answer you can also upvote those you deem "helpful" (it says so when you hover above the up arrow). :) — Will Ness, Nov 04 '14 at 15:50

sepp2k · Accepted Answer · 2014-11-04T14:01:41.010

Unlike go1, go2 checks whether or not its second argument is empty. In order to do that the second argument needs to be evaluated, at least enough to determine whether it is empty or not.

So for your call to foldr this means the following:

Both go1 and go2 are first called with two arguments: 1 and the result of foldr go [] [2 ..]. In the case of go1 the second argument remains untouched, so the result of the foldr is simply 1 : foldr go1 [] [2 ..], and the tail is not evaluated any further until it is accessed.

In the case of go2 however, foldr go [] [2 ..] needs to be evaluated to check whether it is empty. And to do that foldr go [] [3 ..] then needs to be evaluated for the same reason. And so on ad infinitum.
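If the empty-list check has to stay, one way to keep the fold productive is to produce the cons cell first and defer the match into the tail. The following is only a sketch: go2Lazy is a hypothetical name that does not appear in the question, and go2 is repeated for contrast.

```haskell
-- go2 as defined in the question: the match on the second argument
-- happens before any part of the result is produced.
go2 :: a -> [a] -> [a]
go2 a [] = a : []
go2 a b  = a : b

-- Hypothetical variant: the (:) constructor is returned first, and the
-- match on b sits inside the tail, so it only runs when the tail is demanded.
go2Lazy :: a -> [a] -> [a]
go2Lazy a b = a : case b of
  [] -> []
  _  -> b
```

With these definitions, take 3 (foldr go2Lazy [] [1..]) should give [1,2,3] in GHCi, whereas the same expression with go2 never produces a result.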

effectfully · Answer 2 · 2014-11-04T17:04:39.903

To test whether an expression matches a pattern, you need to evaluate it at least to weak head normal form. So pattern matching forces evaluation. One common example is the interleave function, which interleaves two lists. It could be defined like

interleave :: [a] -> [a] -> [a]
interleave  xs     []    = xs
interleave  []     ys    = ys
interleave (x:xs) (y:ys) = x : y : interleave xs ys

But this function is strict in the second argument. A lazier version is

interleave  []    ys = ys
interleave (x:xs) ys = x : interleave ys xs

You can read more here: http://en.wikibooks.org/wiki/Haskell/Laziness
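To see the difference in strictness, the two definitions can be loaded side by side. This is a sketch for experimentation; the names interleaveStrict and interleaveLazy are introduced here only to tell the two versions apart.

```haskell
-- First definition from the answer: the first clause matches on the
-- second argument, so that argument is forced before anything is returned.
interleaveStrict :: [a] -> [a] -> [a]
interleaveStrict  xs     []    = xs
interleaveStrict  []     ys    = ys
interleaveStrict (x:xs) (y:ys) = x : y : interleaveStrict xs ys

-- Second definition: only the first argument is inspected, and the two
-- arguments swap places on every step.
interleaveLazy :: [a] -> [a] -> [a]
interleaveLazy  []    ys = ys
interleaveLazy (x:xs) ys = x : interleaveLazy ys xs
```

In GHCi, head (interleaveLazy [1] undefined) should return 1, while head (interleaveStrict [1] undefined) raises an exception because the second argument is forced first.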

edited Nov 04 '14 at 17:04

answered Nov 02 '14 at 22:33

if it "interleaves two lists", shouldn't it be called `interleave`? "merge" fits better with e.g. mergesort... – Will Ness Nov 04 '14 at 16:00
@Will Ness, Fixed. I've seen this function being called "merge" several times. – effectfully Nov 04 '14 at 17:09

score 0 · Answer 3 · answered Nov 02 '14 at 22:33

It is because of laziness. Because of the way that go1 and go2 were defined in this example, they behave exactly the same way for b == [], but the compiler doesn't know this.

For go1, the right fold can return the cons cell a : b immediately: a is available at once, and b is left unevaluated until it is demanded.

go1 a b -> create and return the value of a, then calculate b

For go2, there is no way to tell which case matches until the value of b is computed, which will never happen.

go2 a b -> wait for the value of b, pattern match against it, then output a:b
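The claim that both functions give the same result, and differ only in when the second argument is evaluated, can be checked on finite input. A minimal sketch, where agreeUpTo and firstThree are hypothetical helper names added for illustration:

```haskell
-- go1 and go2 as defined in the question.
go1 :: a -> [a] -> [a]
go1 a b = a : b

go2 :: a -> [a] -> [a]
go2 a [] = a : []
go2 a b  = a : b

-- Hypothetical helper: compares the two folds on the finite list [1..n].
agreeUpTo :: Int -> Bool
agreeUpTo n = foldr go1 [] [1 .. n] == foldr go2 [] [1 .. n]

-- Only the go1 fold is productive on an infinite list.
firstThree :: [Int]
firstThree = take 3 (foldr go1 [] [1 ..])
```

On a finite list the match in go2 eventually reaches the real end of the list, so both folds finish and agree; the difference only shows up when there is no end to reach.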

score 0 · Answer 4 · answered Nov 02 '14 at 22:35

To see the difference try this in GHCi:

> head (go1 1 (error "urk!"))
1
> head (go2 1 (error "urk!"))
*** Exception: urk!

The issue is that go2 will evaluate its second argument before returning the head of the list. That is, go2 is strict in its second argument, unlike go1 which is lazy.

This matters when you fold over infinite lists:

foldr go1 [] [1..] =
go1 1 (go1 2 (go1 3 ( ..... =
1 : (go1 2 (go1 3 ( ..... =
1 : 2 : (go1 3 ( ...

works fine, but

foldr go2 [] [1..] =
go2 1 (go2 2 (go2 3 ( .....

cannot be simplified to 1:... since go2 insists on evaluating its second argument, which is another call to go2, which in turn requires its own second argument to be evaluated, which is another ...

Well, you get the point. The second one will not halt.
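The nested shape of these expansions comes from how a right fold is defined. As a sketch, a hand-written version for lists could look like this (myFoldr is a hypothetical name; the Prelude's foldr gives the same results on lists):

```haskell
-- A hand-written right fold over lists. In the second clause the
-- combining function f is applied first, and the recursive call is passed
-- to it as an unevaluated argument. Whether that argument is ever
-- evaluated is entirely up to f.
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ z []     = z
myFoldr f z (x:xs) = f x (myFoldr f z xs)
```

A function like go1 returns a cons cell without looking at that argument, so take 3 (myFoldr (:) [] [1..]) should give [1,2,3]. A function that pattern matches on it, like go2, forces the recursive call before returning anything.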
