Excessive amount of system calls when using `threadDelay`

Question

I have a couple of Haskell processes running in production on a system with 12 cores. All processes are compiled with -threaded and run with 12 capabilities. One library they all use is resource-pool, which keeps a pool of database connections.

What's interesting is that even though all processes are practically idle, they consume around 2% CPU time. Inspecting one of these processes with strace -p $(pgrep processname) -f reveals that the process is making an unreasonable number of system calls even though it should not really be doing anything. To put things into perspective:

Running strace on a process with -N2 for 5 seconds produces a 66K log file.
Running it with (unreasonable) -N64 yields a 60 Megabyte log.

So the number of system calls being issued grows drastically with the number of capabilities.

Digging deeper, we find that resource-pool runs a reaper thread which fires every second to check whether it can clean up some resources. We can simulate the same behavior with this trivial program:

module Main where

import Control.Concurrent
import Control.Monad (forever)

main :: IO ()
main = forever $ do
  threadDelay (10 ^ 6)

If I pass -B to the runtime system I get audio feedback whenever a GC is issued, which in this case is every 60 seconds.

When I suppress these GC cycles by passing -I0 to the RTS, running the same strace command on the process yields a log file of only around 70K. Since the process is also running a scotty server, GC is still triggered when requests come in, so collections seem to happen when I actually need them.

Since we are going to increase the number of Haskell processes on this machine considerably over the course of the next year, I was wondering how to keep their idle CPU usage at a reasonable level. Apparently passing -I0 is a rather bad idea (?). Another idea would be to decrease the number of capabilities from 12 to something like 4. Is there any other way to tweak the RTS so that the processes don't burn too many CPU cycles while idling?

You may want to see [GHC Trac #11134](https://ghc.haskell.org/trac/ghc/ticket/11134) as it's quite possible that you are seeing idle GCs. — bgamari, Dec 16 '15 at 19:16
Running multiple processes (M) with N capabilities on machine with P cores will make processes fight for resources bad when M * N > P. IIRC there are spinlocks somewhere aggressively eating CPU in that case. — phadej, Jun 07 '17 at 15:48

score 1 · Answer 1 · answered Sep 10 '21 at 09:21

The way GHC's memory management is structured, a 'major GC' is needed periodically while the program runs in order to keep memory usage under control. This is a relatively expensive operation, and it 'stops the world': the program makes no progress while it is occurring.

Obviously, it is undesirable for this to happen at a crucial point of program execution. Therefore, by default, a major GC is performed whenever a GHC-compiled program goes idle (in the threaded runtime, once it has been idle for 0.3 seconds by default). This is usually an unobtrusive way of keeping the garbage level down and overall memory efficiency and performance up, without interrupting program interaction. This is known as 'idle GC'.
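To watch idle GC happen without relying on the -B bell, the major GC counter can be sampled from inside the program. The sketch below assumes the program is started with +RTS -T so that GHC.Stats is populated. The helper name majorGcsSince and the one-second sampling loop are illustrative only and are not part of any library.

```haskell
module Main where

import Control.Concurrent (threadDelay)
import Control.Monad (unless)
import Data.Word (Word32)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Exit (die)

-- | Hypothetical helper: number of major GCs between two samples
-- of the 'major_gcs' counter.
majorGcsSince :: Word32 -> Word32 -> Word32
majorGcsSince before after = after - before

main :: IO ()
main = do
  -- GHC.Stats only works when the RTS collects statistics (+RTS -T).
  enabled <- getRTSStatsEnabled
  unless enabled $ die "RTS statistics are disabled; run with +RTS -T"
  start <- major_gcs <$> getRTSStats
  loop start
  where
    -- Sleep like the reaper thread does, then report how many major GCs
    -- ran in the meantime.
    loop prev = do
      threadDelay (10 ^ 6)
      now <- major_gcs <$> getRTSStats
      putStrLn ("major GCs since last sample: " ++ show (majorGcsSince prev now))
      loop now
```

Built with -threaded and run with something like +RTS -N12 -T, the counter should keep advancing while the program does nothing else, and adding -I0 should largely stop it. The exact numbers depend on the GHC version and the idle GC settings in effect.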

However, this can become a problem in a scenario like this one: many concurrent processes, each of them woken frequently, running for a short amount of time, then going back to idle. This is a common scenario for server processes. In this case, when idle GC kicks in, it doesn't obstruct the process it is running in, which has completed its work, but it does steal resources from other processes running on the system. Since the program idles frequently, there is no need to incur the overhead of a major GC on every single idle.

The 'brute force' approach would be to pass the RTS option -I0 to the program, disabling idle GC entirely. This solves the immediate problem, but it gives up the opportunity to collect garbage while the program has nothing else to do. Garbage could then accumulate until a major GC kicks in at an inopportune moment.
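If -I0 is used anyway, one way to limit that accumulation is to trigger a major GC from the program itself at a coarse interval. This is not part of the original suggestion, only a minimal sketch of the idea: performMajorGC comes from System.Mem in base, while periodicMajorGC, secondsToMicros and the 60-second interval are hypothetical choices made for illustration.

```haskell
module Main where

import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)
import System.Mem (performMajorGC)

-- | Convert whole seconds to the microseconds that 'threadDelay' expects.
secondsToMicros :: Int -> Int
secondsToMicros s = s * 1000000

-- | Hypothetical helper: force a major GC every @intervalSeconds@ seconds,
-- regardless of whether the program is idle at that moment.
periodicMajorGC :: Int -> IO ()
periodicMajorGC intervalSeconds = forever $ do
  threadDelay (secondsToMicros intervalSeconds)
  performMajorGC

main :: IO ()
main = do
  -- Background collector; the interval is an arbitrary example value.
  _ <- forkIO (periodicMajorGC 60)
  -- Stand-in for the real application (here: the once-per-second sleeper).
  forever $ threadDelay (secondsToMicros 1)
```

Unlike idle GC, this collects on a fixed schedule even when the program happens to be busy, so the interval is a trade-off between memory growth and the chance of pausing real work.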

Partly in response to this question, the flag -Iw was added to the GHC runtime system. It sets a minimum wait between idle GCs. For example, with -Iw5 an idle GC will not run until 5 seconds have elapsed since the last one, even if the program goes idle several times in between. This should solve the problem.
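Whichever flags are chosen, it can be worth confirming at startup what the RTS actually received, since options can come from the command line, the GHCRTS environment variable, or link-time defaults. The following is a small sketch using getNumCapabilities from Control.Concurrent and the doIdleGC field from GHC.RTS.Flags. The describeRts helper and its output format are made up for this example, and the sketch only reports whether idle GC is enabled, not the -Iw wait itself.

```haskell
module Main where

import Control.Concurrent (getNumCapabilities)
import GHC.RTS.Flags (GCFlags (..), getGCFlags)

-- | Hypothetical helper: render the settings of interest on one line.
describeRts :: Int -> Bool -> String
describeRts caps idleGc =
  "capabilities=" ++ show caps
    ++ " idle-gc=" ++ (if idleGc then "on" else "off")

main :: IO ()
main = do
  caps <- getNumCapabilities   -- reflects -N
  gcFlags <- getGCFlags        -- GC-related RTS flags as the RTS parsed them
  putStrLn (describeRts caps (doIdleGC gcFlags))
```

Logging this line once per process makes it easier to check, across many deployed processes, that none of them is still running with more capabilities than intended or with idle GC in an unexpected state.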

Just keep in mind the caveat in the GHC User's Guide:

This is an experimental feature, please let us know if it causes problems and/or could benefit from further tuning.

Happy Haskelling!
