22

In the following code:

ismaxl :: (Ord a) => [a] -> a -> Bool
ismaxl l x = x == maxel
           where maxel = maximum l

main = do
  let mylist = [1, 2, 3, 5]
  let ismax = ismaxl mylist
  --Is each call O(1)?  Does each call remember maxel?
  let c1 = ismax 1
  let c2 = ismax 2
  let c3 = ismax 3
  let c5 = ismax 5
  putStrLn (show [c1, c2, c3, c5])

Does the partial function ismax, compute the maxel? Speficially, can someone point to a rule about the complexity of partial functions in Haskell? MUST the compiler only call maximum once in the above example? Put another way, does a partial function keep the references of prior calls for internal where clauses?

I have some CPU-bound code that is not performing acceptably, and I'm looking for possible errors in my reasoning about the complexity.

Don Stewart
  • 137,316
  • 36
  • 365
  • 468
Oscar Boykin
  • 1,974
  • 2
  • 11
  • 16

4 Answers4

17

As a demonstration of what you can learn from profiling your Haskell code, here's the result of some minor modifications to your code. First, I've replaced mylist with [0..10000000] to make sure it takes a while to compute the maximum.

Here's some lines from the profiling output, after running that version:

COST CENTRE                    MODULE               %time %alloc

ismaxl                         Main                  55.8    0.0
main                           Main                  44.2  100.0

                                                         individual    inherited
COST CENTRE              MODULE         no.    entries  %time %alloc   %time %alloc

MAIN                     MAIN            1           0   0.0    0.0   100.0  100.0
 CAF:main_c5             Main          225           1   0.0    0.0    15.6    0.0
  main                   Main          249           0   0.0    0.0    15.6    0.0
   ismaxl                Main          250           1  15.6    0.0    15.6    0.0
 CAF:main_c3             Main          224           1   0.0    0.0    15.6    0.0
  main                   Main          246           0   0.0    0.0    15.6    0.0
   ismaxl                Main          247           1  15.6    0.0    15.6    0.0
 CAF:main_c2             Main          223           1   0.0    0.0    14.3    0.0
  main                   Main          243           0   0.0    0.0    14.3    0.0
   ismaxl                Main          244           1  14.3    0.0    14.3    0.0
 CAF:main_c1             Main          222           1   0.0    0.0    10.4    0.0
  main                   Main          239           0   0.0    0.0    10.4    0.0
   ismaxl                Main          240           1  10.4    0.0    10.4    0.0
 CAF:main8               Main          221           1   0.0    0.0    44.2  100.0
  main                   Main          241           0  44.2  100.0    44.2  100.0

It's pretty obviously recomputing the maximum here.

Now, replacing ismaxl with this:

ismaxl :: (Ord a) => [a] -> a -> Bool
ismaxl l = let maxel = maximum l in (== maxel)

...and profiling again:

COST CENTRE                    MODULE               %time %alloc

main                           Main                  60.5  100.0
ismaxl                         Main                  39.5    0.0

                                                         individual    inherited
COST CENTRE              MODULE         no.    entries  %time %alloc   %time %alloc

MAIN                     MAIN            1           0   0.0    0.0   100.0  100.0
 CAF:main_c5             Main          227           1   0.0    0.0     0.0    0.0
  main                   Main          252           0   0.0    0.0     0.0    0.0
   ismaxl                Main          253           1   0.0    0.0     0.0    0.0
 CAF:main_c3             Main          226           1   0.0    0.0     0.0    0.0
  main                   Main          249           0   0.0    0.0     0.0    0.0
   ismaxl                Main          250           1   0.0    0.0     0.0    0.0
 CAF:main_c2             Main          225           1   0.0    0.0     0.0    0.0
  main                   Main          246           0   0.0    0.0     0.0    0.0
   ismaxl                Main          247           1   0.0    0.0     0.0    0.0
 CAF:main_c1             Main          224           1   0.0    0.0     0.0    0.0
 CAF:main_ismax          Main          223           1   0.0    0.0    39.5    0.0
  main                   Main          242           0   0.0    0.0    39.5    0.0
   ismaxl                Main          243           2  39.5    0.0    39.5    0.0
 CAF:main8               Main          222           1   0.0    0.0    60.5  100.0
  main                   Main          244           0  60.5  100.0    60.5  100.0

...this time it's spending most of its time in one single call to ismaxl, the others being too fast to even notice, so it must be computing the maximum only once here.

C. A. McCann
  • 76,893
  • 19
  • 209
  • 302
11

Here's a modified version of your code that will allow you to see whether or not maxel is reused:

import Debug.Trace

ismaxl :: (Ord a) => [a] -> a -> Bool
ismaxl l x = x == maxel
           where maxel = trace "Hello" $ maximum l

main = do
  let mylist = [1, 2, 3, 5]
  let ismax = ismaxl mylist
  --Is each call O(1)?  Does each call remember maxel?
  let c1 = ismax 1
  let c2 = ismax 2
  let c3 = ismax 3
  let c5 = ismax 5
  putStrLn (show [c1, c2, c3, c5])

You'll see that maxel is not 'remembered' between applications.

In general, you shouldn't expect Haskell to start doing reductions until all of the arguments have been supplied to a function.

On the other hand, if you have aggressive optimisation turned on then it's hard to predict what a particular compiler would actually do. But you probably ought not to rely on any part of the compiler that's hard to predict when you can easily rewrite the code to make what you want explicit.

sigfpe
  • 7,996
  • 2
  • 27
  • 48
6

Building off other good answers, GHC hasn't been eager to perform this sort of optimization in my experience. If I can't easily make something point-free, I've often resorted to writing with a mix of bound vars on the LHS and a lambda:

ismaxl :: (Ord a) => [a] -> a -> Bool
ismaxl l = \x -> x == maxel
           where maxel = maximum l

I don't particularly like this style, but it does ensure that maxel is shared between calls to a partially applied ismaxl.

John L
  • 27,937
  • 4
  • 73
  • 88
  • 3
    To be even more explicit: `ismaxl l = let maxel = maximum l in \x -> x == maxel`. To the compiler it's more or less the same, but to my eye it seems a bit more obvious that the "let" is outside the lambda. – mokus Nov 12 '10 at 20:25
1

I haven't been able to find any such requirement in the Haskell Report, and in fact GHC doesn't seem to perform this optimization by default.

I changed your main function to

main = do
  let mylist = [1..99999]
  let ismax = ismaxl mylist
  let c1 = ismax 1
  let c2 = ismax 2
  let c3 = ismax 3
  let c5 = ismax 5
  putStrLn (show [c1, c2, c3, c5])

Simple profiling shows (on my old Pentium 4):

$ ghc a.hs
$ time ./a.out 
[False,False,False,False]

real    0m0.313s
user    0m0.220s
sys     0m0.044s

But when I change the definition of c2, c3 and c5 to let c2 = 2 == 99999 etc. (leaving c1 as it is), I get

$ ghc a.hs
$ time ./a.out 
[False,False,False,False]

real    0m0.113s
user    0m0.060s
sys     0m0.028s
Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • 2
    The behavior everyone is talking about is not in the Haskell report. A Haskell implementation that used call by name instead of laziness (duplicating work on substitution) would be conformant. But nobody would use it because it would be too slow :-P – luqui Nov 12 '10 at 20:22