
Related to my other question, "Haskell collections with guaranteed worst-case bounds for every single operation?", I'm curious: how long can the pauses caused by garbage collection be?

Does Haskell use some kind of incremental garbage collection so that a program is stopped only for small periods at a time, or can it stop for several seconds in an extreme case?

I found two of SPJ's papers on the subject: https://research.microsoft.com/en-us/um/people/simonpj/papers/non-stop/index.htm. But I couldn't find any indication of whether these ideas were actually adopted by GHC (or other Haskell implementations).

Petr
    Note that in theory there are many causes of unpredictable unbounded delays in a program's operation. These include things like virtual memory paging, and context switches performed by the OS. These apply to all languages, even those that do not use automatic garbage collection. In theory these can insert delays between any two operations of any program, with *no guarantee* of a maximum delay length. In practice, they are rarely a problem that you have to deal with in "normal" programming. GC delays are the same. – Ben Sep 14 '12 at 00:45

1 Answer


GHC is designed for computational throughput, not latency. As a result, GHC uses a generational, multi-threaded garbage collector with thread-local heaps. Garbage collection of thread-local objects does not stop other threads. The occasional major GC of the global heap will pause all threads.

Typically the pauses last a few milliseconds; however, there is no guarantee on latency.

You can control the frequency of GC via several runtime flags (e.g. the idle GC interval, `+RTS -I`).
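To see what pauses a particular program actually incurs, one option is to read the runtime's own GC counters from inside the program. The following is only a minimal sketch, assuming GHC 8.2 or later (where `GHC.Stats.getRTSStats` is available); the workload and the helper `nsToMs` are hypothetical and exist purely for illustration. It reports the mean GC time per collection, not the worst-case pause.

```haskell
module Main (main) where

import Control.Monad (unless)
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Exit (die)

-- | Hypothetical helper: convert nanoseconds (the unit used by
-- GHC.Stats) to milliseconds.
nsToMs :: Int64 -> Double
nsToMs ns = fromIntegral ns / 1e6

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T
  -- (or another statistics flag such as -s).
  enabled <- getRTSStatsEnabled
  unless enabled $ die "RTS statistics are disabled; run with +RTS -T"

  -- Illustrative workload (an assumption, not a benchmark): the list is
  -- used twice, so it stays live on the heap while it is traversed.
  let xs = [1 .. 2000000] :: [Int]
  print (sum xs + length xs)

  stats <- getRTSStats
  let collections = gcs stats
      totalGcMs   = nsToMs (gc_elapsed_ns stats)
  putStrLn ("collections:        " ++ show collections)
  putStrLn ("major collections:  " ++ show (major_gcs stats))
  putStrLn ("total GC time (ms): " ++ show totalGcMs)
  -- Mean wall-clock time per collection; guard against division by zero.
  unless (collections == 0) $
    putStrLn ("mean GC time (ms):  "
              ++ show (totalGcMs / fromIntegral collections))
```

One way to try it: compile with `ghc -O2 -rtsopts Main.hs` and run `./Main +RTS -T`. Since the numbers depend heavily on heap size and workload, they are only meaningful when measured on the real program.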

Don Stewart
    I'm pretty sure the thread-local heaps stuff isn't actually in the mainline. Simon Marlow said it made the GC much more complicated while not improving GC performance that much. So, the revised answer is: all GCs cause pauses (usually pretty short) and major GCs are performed in parallel. The actual pause times depend very much on the heap size. `+RTS -s` actually prints the pause times, so it's easy to find out. – nominolo Sep 22 '12 at 19:26