In Haskell, performance and where binding

Question

I am just learning Haskell and wrote two programs based on a tutorial site:

maximumnowhere :: (Ord a) => [a] -> a
maximumnowhere [] = error "empty"
maximumnowhere [x] = x
maximumnowhere (x:xs) = if x > maximumnowhere xs then x else maximumnowhere xs

and

maximumwhere :: (Ord a) => [a] -> a
maximumwhere [] = error "empty"
maximumwhere [x] = x
maximumwhere (x:xs) = if x > maximum' then x else maximum'
  where maximum' = maximumwhere xs

I thought these two programs were essentially equivalent, because I assumed a where binding simply replaces the variable with its definition. But when I run them in ghci, the first one is far slower than the second, especially for a list longer than about 25 elements. The where binding presumably causes this huge performance difference, but I don't know why. Can anyone explain it?

The first one doesn't share evaluations of `maximumnowhere xs` (used in both the if conditional and the else case) - if you want sharing you should do it yourself as per the second version. — stephen tetley, Dec 31 '11 at 14:48
Adding further info, GHC generally doesn't do common subexpression elimination (which would make both versions perform the same). This is because CSE can introduce space leaks in a lazy language - see the GHC FAQ - http://www.haskell.org/haskellwiki/GHC:FAQ#Does_GHC_do_common_subexpression_elimination.3F — stephen tetley, Dec 31 '11 at 14:52
Why do people use ghci for performance measurements? There is an optimizing compiler you can test with.. — Thomas M. DuBuisson, Dec 31 '11 at 15:08
Thank you guys. Like I said above, I'm just learning Haskell now and didn't know about compiler things or optimization. — user1124390, Dec 31 '11 at 15:15
@ThomasM.DuBuisson , I tested the two functions with `ghc 7.0.3` and `-O2`. It does not optimize away the difference. — HaskellElephant, Dec 31 '11 at 16:00
@HaskellElephant: actually, I said what I mentioned. It might not help in this case, but people have been posting lots of "haskell in ghci underperforms" type questions. — Thomas M. DuBuisson, Dec 31 '11 at 16:31

score 14 · Answer 1 · answered Dec 31 '11 at 16:20

No, they are not equivalent. let and where introduce sharing, which means that the value is only evaluated once. The compiler will in general not share the result of two identical expressions unless you tell it to, because it cannot in general tell on its own whether the space-time trade-off of doing this is beneficial or not.

Thus, your first program can make two recursive calls per step (one in the condition and, whenever x is not the larger value, another in the else branch), making it O(2^n) in the worst case, while the second makes only one per step, i.e. O(n). The difference between these is huge. At n = 25, for example on an ascending list, the first program results in over 33 million recursive calls while the second only does 25.
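To make those numbers concrete, here is a minimal sketch of the two call-count recurrences. It assumes the worst case for the first program (an ascending list, so `x > maximumnowhere xs` is always False and the else branch recomputes the maximum of the tail). The names `callsNoWhere` and `callsWhere` are made up for this illustration and do not come from the question, and the counts include the outermost call.

```haskell
-- Hypothetical helpers: they count calls, they do not compute a maximum.

-- Calls made by the version without sharing on a list of length n,
-- worst case: this call, plus two full recursive calls on the tail.
callsNoWhere :: Int -> Integer
callsNoWhere n
  | n <= 1    = 1                             -- singleton list: no recursion
  | otherwise = 1 + 2 * callsNoWhere (n - 1)

-- Calls made by the version with the where binding:
-- this call, plus one recursive call on the tail.
callsWhere :: Int -> Integer
callsWhere n
  | n <= 1    = 1
  | otherwise = 1 + callsWhere (n - 1)
```

Loaded into ghci, `callsNoWhere 25` evaluates to 33554431 (that is, 2^25 - 1), while `callsWhere 25` evaluates to 25.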

So the moral of the story is that if you want sharing, you need to ask for it by using let or where.
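For comparison, the same sharing can be written with let instead of where. This is only a sketch: `maximumLet` and `rest` are names invented here, not taken from the question.

```haskell
-- Same algorithm as maximumwhere, with the shared result named via let.
maximumLet :: Ord a => [a] -> a
maximumLet []     = error "empty"
maximumLet [x]    = x
maximumLet (x:xs) =
  let rest = maximumLet xs          -- evaluated at most once, then reused
  in  if x > rest then x else rest
```

Both uses of `rest` refer to the same binding, so the recursive call on `xs` happens once per step, just as in the where version.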

+1 Nice answer. Due to Haskell's purity, we often emphasize equational reasoning, but for performant Haskell, it is important to know what assumptions the compiler is making. (In this case, GHC generally expects the programmer to indicate sharing explicitly.) — Dan Burton, Dec 31 '11 at 20:16
