Are there any recommended guidelines for when to use strictness in Haskell?

For example, I was looking at the tagsoup library. One of its data structures is defined like this:

data Tag str    
    = TagOpen str [Attribute str]
    | TagClose str
    | TagText str
    | TagComment str
    | TagWarning str
    | TagPosition !Row !Column 

type Row = Int 
type Column = Int

So on what basis did they decide that the fields of TagPosition should be strict? Are there any recommended guidelines for this?

Sibi
  • A list is non-strict and represents a stream (it makes sense to evaluate its elements as needed). An Int represents a number. It could still make sense to evaluate it lazily (e.g., it might represent the length of a stream that you don't really need). In short: it depends on the application. – d8d0d65b3f7cf42 Jan 22 '14 at 09:49
  • @d8d0d65b3f7cf42 Actually, if we're talking about something like Int, *would* it be faster to be lazy if you never used the value? Numeric operations are very fast and I'm not sure how that compares to the speed of making a thunk. That's also not taking into consideration the memory cost. – David Young Jan 22 '14 at 18:23
  • @DavidYoung As I'm saying, it's unlikely. But if you pose this as a challenge, I'm sure it's possible to come up with a scenario where it's faster. It's always possible to have an `Int` field on which you have few, but extremely costly computations, and that you rarely need. Then a lazy `Int` would be faster. – kosmikus Jan 22 '14 at 19:34

1 Answer

For simple, unstructured datatypes such as Int or Double, turning them into strict fields is often a good default. That makes their space consumption very predictable (and constant). While it's possible that performance degrades due to performing unnecessary computations, this is, in general, unlikely. For example, keeping track of a position is usually extremely simple and inexpensive, so there's nothing to be afraid of in terms of performance, and having predictable space behaviour is far more important.
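
As a rough illustration of the space argument, here is a minimal sketch. The types LazyPos and StrictPos and all function names are hypothetical and not taken from tagsoup; the two versions differ only in the strictness annotations on the fields.

```haskell
import Data.List (foldl')

-- Hypothetical types for illustration; neither is tagsoup's definition.
data LazyPos   = LazyPos Int Int      -- fields may hold unevaluated thunks
data StrictPos = StrictPos !Int !Int  -- fields are forced whenever the constructor is

-- Advance a position by one character; '\n' starts a new row at column 1.
stepLazy :: LazyPos -> Char -> LazyPos
stepLazy (LazyPos r _) '\n' = LazyPos (r + 1) 1
stepLazy (LazyPos r c) _    = LazyPos r (c + 1)

stepStrict :: StrictPos -> Char -> StrictPos
stepStrict (StrictPos r _) '\n' = StrictPos (r + 1) 1
stepStrict (StrictPos r c) _    = StrictPos r (c + 1)

-- foldl' forces the accumulator to WHNF at every step. For LazyPos that
-- stops at the constructor, so the fields can pile up (+ 1) thunks until
-- the result is inspected. For StrictPos, forcing the constructor forces
-- both Ints as well, so each step stores plain evaluated numbers.
endLazy :: String -> (Int, Int)
endLazy s = case foldl' stepLazy (LazyPos 1 1) s of
  LazyPos r c -> (r, c)

endStrict :: String -> (Int, Int)
endStrict s = case foldl' stepStrict (StrictPos 1 1) s of
  StrictPos r c -> (r, c)
```

Both functions return the same result, e.g. (2,3) for the input "ab\ncd". The difference is only in how much memory may be held while a long input is being processed, and how visible it is can depend on the optimisation level.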

An additional advantage of making simple types strict is that they can often be unpacked, i.e., stored directly within the constructor instead of via an additional indirection (in GHC, this is requested with the UNPACK pragma or the -funbox-strict-fields flag). For small types, this is usually an advantage.
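
A minimal sketch of what unpacking looks like in GHC, using a hypothetical Pos type (not tagsoup's definition). The UNPACK pragma is only a request, and as far as I know GHC acts on it only when optimisation is enabled.

```haskell
-- Hypothetical position type for illustration. Each field is strict (!)
-- and asks GHC to store the raw Int inside the constructor rather than
-- a pointer to a separately allocated boxed Int.
data Pos = Pos
  { posRow :: {-# UNPACK #-} !Int
  , posCol :: {-# UNPACK #-} !Int
  } deriving (Eq, Show)

-- Move one column to the right; the new column is evaluated as soon as
-- the resulting Pos is, because the field is strict.
nextCol :: Pos -> Pos
nextCol p = p { posCol = posCol p + 1 }
```

Compiling a module with -funbox-strict-fields requests the same treatment for every strict field without writing the pragma on each one.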

For structured datatypes such as lists or trees, the situation is far more complicated. A simple ! will rarely help here, because it only forces the value to weak head normal form (WHNF), i.e., up to its outermost constructor. An evaluated list or tree can also easily be more costly in terms of space than an unevaluated thunk. Nevertheless, it sometimes makes sense to make such data strict as well. In such cases, you would usually wrap the constructor using a function (a so-called smart constructor) that establishes strictness invariants by calling deepseq in appropriate places.
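
A minimal sketch of the smart-constructor idea, assuming the deepseq package (module Control.DeepSeq) is available. The types Shallow and Tokens and the function names are hypothetical, chosen only for illustration.

```haskell
import Control.DeepSeq (deepseq)

-- A bang on a list field only forces the list to WHNF, that is, its
-- first cons cell (or []). The rest of the list stays lazy.
data Shallow = Shallow ![Int]

-- Hypothetical wrapper whose list is meant to be fully evaluated.
-- In a real module the Tokens constructor would not be exported, so
-- that mkTokens is the only way to build a value.
data Tokens = Tokens [Int]

-- Smart constructor: deepseq evaluates every element of the list
-- before the Tokens value is returned.
mkTokens :: [Int] -> Tokens
mkTokens xs = xs `deepseq` Tokens xs

tokenCount :: Tokens -> Int
tokenCount (Tokens xs) = length xs
```

With these definitions, evaluating Shallow (1 : undefined) to WHNF succeeds, because only the first cons cell is forced, whereas evaluating mkTokens (1 : undefined) fails immediately, because deepseq walks the whole list.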

kosmikus