I have a Debian 9 server and the load is over 4000. "top" claims there are 18,132 processes sleeping.

Initially, I saw many "ps" processes, hundreds, so I killed them all by name.

I don't see any offending processes currently, but the load remains. Further, I cannot do a remote reboot. It claims the system is going down for reboot, but nothing happens. If I open another terminal, it still works.

How do I get rid of these backed-up processes without having someone reboot the server on site?

Corepuncher

1 Answer

Sleeping tasks are not inflating your load average. Runnable tasks are, and on Linux so are tasks in uninterruptible sleep (TASK_UNINTERRUPTIBLE, shown as state D by ps and top), which usually means they are waiting on I/O.
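To check which states the tasks are actually in, something like the following rough sketch could be run on the affected machine. It reads the state letter from each `/proc/<pid>/stat` file directly, so it does not depend on `ps`. The function names `parse_state` and `count_states` are made up for this example, and it counts processes rather than individual threads, so the total only approximates what the kernel feeds into the load average.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so cut at the last ')' instead of splitting on whitespace.
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]


def count_states(proc_root: str = "/proc") -> Counter:
    """Count processes per state by scanning the numeric entries in /proc."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as fh:
                counts[parse_state(fh.read())] += 1
        except (OSError, IndexError):
            # The process exited while we were looking, or the line was
            # unreadable; skip it.
            continue
    return counts


if __name__ == "__main__":
    states = count_states()
    print(dict(states))
    # R = runnable, D = uninterruptible sleep; S (sleeping) is not counted.
    print("contributing to load (R + D):", states["R"] + states["D"])
```

A large D count points at stuck I/O (for example a hung disk or network mount), while a large R count points at CPU contention.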

A new shell may stay tolerably responsive because the scheduler favors interactive tasks. As soon as you do any significant work, though, things become very slow.

First find and stop whatever script or service is spawning these processes, then kill the leftover tasks. This will likely take your usual process tools (ps, pkill) and some patience with an unresponsive system.
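As a hedged illustration of the "find what is starting them" step, the sketch below groups processes by parent PID and command name using `/proc/<pid>/stat`, so the parent with the most identical children stands out. The helper names `parse_comm_ppid` and `children_per_parent` are hypothetical, and the approach assumes the runaway processes still have their original parent; orphans show up under PID 1 or a subreaper instead.

```python
import os
from collections import Counter


def parse_comm_ppid(stat_line):
    """Extract (command name, parent PID) from a /proc/<pid>/stat line."""
    open_paren = stat_line.index("(")
    close_paren = stat_line.rfind(")")  # comm may contain ')' itself
    comm = stat_line[open_paren + 1:close_paren]
    fields = stat_line[close_paren + 1:].split()
    # After the command name: fields[0] is the state, fields[1] the ppid.
    return comm, int(fields[1])


def children_per_parent(proc_root="/proc"):
    """Count processes per (parent PID, command name) pair."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as fh:
                comm, ppid = parse_comm_ppid(fh.read())
        except (OSError, ValueError, IndexError):
            # Process vanished or the line could not be parsed; skip it.
            continue
        counts[(ppid, comm)] += 1
    return counts


if __name__ == "__main__":
    # Show the five largest groups of same-named children under one parent.
    for (ppid, comm), n in children_per_parent().most_common(5):
        print("{:6d} x {!r} started by PID {}".format(n, comm, ppid))
```

If one parent PID owns hundreds of `ps` children, that parent (often a cron job or monitoring script) is the thing to stop before cleaning up with pkill.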

A clean reboot will get things responsive again, but it will likely take a very long time to get enough CPU time to shut down a few thousand tasks. Get a remote, out-of-band way to power off: the BMC for a physical server, or the hypervisor console for a VM.

Preventive measures include per-user task limits. With pam_limits on Linux, these are nproc lines in files under /etc/security/limits.d/.
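A minimal example of such nproc lines, using the standard `<domain> <type> <item> <value>` layout of limits.conf. The file name, the user `appuser`, the group `developers`, and the numbers are all placeholders chosen for illustration, not recommendations:

```text
# /etc/security/limits.d/90-nproc.conf  (hypothetical file name)
# <domain>    <type>  <item>  <value>
appuser       soft    nproc   512
appuser       hard    nproc   1024
@developers   hard    nproc   2048
```

These limits are applied by PAM when a session starts, so they only affect new logins. Group and wildcard entries do not apply to root; root has to be named explicitly if it should be limited.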

John Mahowald