I saw this comment in Containers/Data/Set/Base.hs

-- [Note: Order of constructors]
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- The order of constructors of Set matters when considering performance.
-- Currently in GHC 7.0, when type has 2 constructors, a forward conditional
-- jump is made when successfully matching second constructor. Successful match
-- of first constructor results in the forward jump not taken.
-- On GHC 7.0, reordering constructors from Tip | Bin to Bin | Tip
-- improves the benchmark by up to 10% on x86.
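
For context, here is a simplified sketch of the kind of declaration the note is talking about. The field layout and the `member` function are assumptions for illustration (the real library adds strictness and unpacking annotations). The point is only that `Bin` is listed before `Tip` in the `data` declaration:

```haskell
-- Simplified, hypothetical stand-in for the Set type in containers.
-- Bin comes first in the declaration, which is the ordering the note
-- reports as faster on GHC 7.0.
data Set a
  = Bin Int a (Set a) (Set a)  -- size, element, left subtree, right subtree
  | Tip

-- Illustrative lookup; the clause order here is independent of the
-- constructor order chosen in the declaration above.
member :: Ord a => a -> Set a -> Bool
member _ Tip = False
member x (Bin _ y l r) = case compare x y of
  LT -> member x l
  GT -> member x r
  EQ -> True
```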

Where else does order have tiny measurable impacts on performance? In particular I wonder about case statements with many options.

stkent
Michael Fox
  • fyi, GHC 7.0 was released 4 years ago, so things might have changed since then – ErikR Dec 05 '14 at 20:21
  • @user5402, yeah, but they haven't. This particular bit of code hasn't been touched much in recent history, I don't think. – dfeuer Dec 05 '14 at 20:24

1 Answer

It depends. The order of constructors in the type declaration does, unfortunately, make a difference. The order of the patterns for that type does not. Whether you write

foo (Bin x y) = ...
foo Tip = ...

or

foo Tip = ...
foo (Bin x y) = ...

makes no difference, because they will be reordered by constructor order immediately in the "desugaring" process. The order of matching for multiple patterns is always semantically left-to-right, so the argument order can matter if you use multiple patterns together (you can always get around this with case). But GHC feels very free to reorganize code, sometimes for good and sometimes for ill. For instance, if you write

foo :: Int -> Bool
foo x = x == 5 || x == 0 || x == 7 || x == 4

GHC will smash it into (essentially)

foo = \x -> case x of
              0 -> True
              4 -> True
              5 -> True
              7 -> True
              _ -> False

and then do a sort of binary search over these possibilities. This is probably far from optimal in many cases, and it is especially annoying if you happen to know that x == 5 is particularly likely. But that's how it is for now, and changing it would make things perform badly in certain situations unless someone does a lot of work.
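
Two of the points above can be checked with a small sketch. The names `andL`, `andR`, `fooOr`, `fooCase` and `agree` are made up for illustration. The first pair shows that which argument is matched first is semantically visible, and that an explicit case lets you pick the order. The second pair only checks that the `||` chain and the case form compute the same function. It says nothing about the code GHC actually generates, which you would have to inspect yourself, for example with `-ddump-simpl`.

```haskell
-- Matches the first argument first: the second argument is never
-- inspected when the first one is False.
andL :: Bool -> Bool -> Bool
andL False _ = False
andL True  b = b

-- Uses an explicit case to inspect the second argument first instead.
andR :: Bool -> Bool -> Bool
andR a b = case b of
  False -> False
  True  -> a

-- The chain of equality tests from the answer.
fooOr :: Int -> Bool
fooOr x = x == 5 || x == 0 || x == 7 || x == 4

-- The case form from the answer, written out by hand.
fooCase :: Int -> Bool
fooCase x = case x of
  0 -> True
  4 -> True
  5 -> True
  7 -> True
  _ -> False

-- The two definitions agree on a small range of inputs.
agree :: Bool
agree = all (\x -> fooOr x == fooCase x) [-10 .. 10]
```

Here `andL False undefined` returns `False` while `andR False undefined` diverges, and the reverse holds for `andR undefined False` versus `andL undefined False`.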

dfeuer
  • I was under the impression that for function definitions GHC respected the order in which you defined the patterns, since if it decided to rearrange patterns that can overlap it might make your function behave incorrectly. – bheklilr Dec 05 '14 at 20:25
  • Simple constructor patterns like `Bin x y` and `Tip` obviously can't overlap. Of course, multiple patterns using the same constructor can overlap and can't be reordered. What kind of patterns are you thinking of? – ErikR Dec 05 '14 at 20:29
  • @bheklilr, GHC respects the *semantics* of the pattern matching, but doesn't respect the actual code structure, so the *performance* can be different. It has special code for equality with literals, and also smashes `case` as I showed. There are all sorts of other things too, like the case-of-case transformation. – dfeuer Dec 05 '14 at 23:13