8

Consider this block of code:

isPrime primes' n = foldr (\p r -> p * p > n || (n `rem` p /= 0 && r)) True primes'

primes = 2 : filter (isPrime primes) [3..]

main = putStrLn $ show $ sum $ takeWhile (< 1000000) primes

which calculates the sum of all primes below one million. It takes 0.468 seconds to print the result on my machine. But if the definitions of isPrime and primes are moved into another module, it takes 1.23 seconds, which is almost 3x slower.

Of course I can copy/paste the definitions everywhere they're required, but I'm also curious about why this is happening and how to solve it.


[Edit] I'm using GHC 7.0.3 (Windows 7 + MinGW). The code is written in EclipseFP (which uses Scion as its IDE back-end) and built into an executable with the -O2 flag.

I also tried building the package outside the IDE:

executable test
  hs-source-dirs:  src
  main-is:         Main.hs
  build-depends:   base >= 4
  ghc-options:     -O2
  other-modules:   Primes

executable test2
  hs-source-dirs:  src2
  main-is:         Main.hs
  build-depends:   base >= 4
  ghc-options:     -O2

Here's the result:

$ time test/test
37550402023

real    0m1.296s
user    0m0.000s
sys     0m0.031s

$ time test2/test2
37550402023

real    0m0.520s
user    0m0.015s
sys     0m0.015s
claude

2 Answers

7

I can reproduce this if I put isPrime and primes in different modules. (If they are in the same module, but still separate from main, I see no difference).

Adding {-# INLINE isPrime #-} gives back the same performance as having all three in one module, so it would appear that GHC needed a nudge to do cross-module inlining in this case.
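For reference, here is a minimal sketch of where the pragma could go. It assumes both definitions live in a module called Primes (the name comes from the cabal file in the question; the exact layout is a guess). The type signatures are my addition, with Integer chosen to match what the unannotated code defaults to. Whether this recovers the single-module timing will depend on the GHC version and on how the definitions are split across modules.

```haskell
-- Primes.hs (module name from the question's cabal file; layout is an assumption)
module Primes (isPrime, primes) where

-- Ask GHC to expose isPrime's definition in the interface file and
-- inline it at call sites, including those in importing modules.
{-# INLINE isPrime #-}
-- Signature added for illustration; the original code has none.
isPrime :: Integral a => [a] -> a -> Bool
isPrime primes' n =
  foldr (\p r -> p * p > n || (n `rem` p /= 0 && r)) True primes'

-- Integer matches the type the unannotated top-level binding defaults to.
primes :: [Integer]
primes = 2 : filter (isPrime primes) [3..]
```

Main.hs would then import primes from Primes and keep main exactly as in the question.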

This is on GHC 7.0.2, Ubuntu 11.04, 64-bit

hammar
  • 5
    GHC will perform very aggressive inlining within a module, especially if the function to be inlined is not exported. It's much less eager to inline functions across module boundaries, unless you INLINE them manually. – John L Sep 13 '11 at 16:41
1

Are you running this inside GHCi or compiling via GHC? I just tried an experiment: first keeping all the definitions in the same file, then moving the first two out, and compiling via GHC with the -O flag both on and off. There is no perceptible difference between any of the combinations on my machine (all run just a few milliseconds over 1 second, using GHC 7).

Dominic Mulligan
  • Do you use `-O` or `-O2`? IMHO many optimizations that may be affected by code motion are triggered by the second flag. – fuz Sep 13 '11 at 12:30
  • Build environment information added to the original post, thanks! – claude Sep 13 '11 at 12:34
  • @FUZxxl I actually tried with both. No perceptible difference in either case. The overall fastest execution was with no optimisation flags passed to GHC, but we're talking about an overall spread of about 100ms in execution time between all the combinations on my machine. – Dominic Mulligan Sep 13 '11 at 12:34