
I have a rather old Linux machine with 2GB of RAM and no swap, and it's working very well, with the system using every unused piece of memory for caching to great effect.

However, when I'm close to exhausting memory (e.g., >1950MB allocated), it slows to a crawl; I suspect that's because there are no disk buffers left. I know that the OOM killer should soon kick in, but it usually doesn't get there -- the machine becomes so slow that the load shoots to 30-40, no process makes any progress (and thus none allocates more memory), and I have to restart it.

When I try to just kill one process to get the machine to respond, e.g. by going to the console (via Alt-F1, logging in, and just doing a "killall badprocess"), it usually works, except that I have to wait ~10 minutes between entering user/password and getting a prompt -- all while there is disk activity.

Again, there's no swap, so it isn't swapping -- it's just thrashing because it has no buffers left.

I would much rather have 100MB or so dedicated exclusively to disk buffers. That would trigger the OOM killer earlier (less memory for programs, after all), but on the other hand it would leave the machine responsive at all times.

Is there a way to do that? I haven't been able to find a /proc/sys/vm or /sys/vm entry that does this kind of thing.

HopelessN00b

2 Answers


Have a look at /proc/sys/vm/min_free_kbytes. It sets the number of kilobytes the kernel tries to keep free; when free memory falls below it, the kernel starts reclaiming aggressively and, as a last resort, invokes the oom-killer. It would also be good to check the logs for the keyword oom-killer, so you know what is being killed (you probably don't want ssh to be killed; it's better to renice it).
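
A minimal sketch of how you might try this (the ~100MB value is just an example; pick what suits your machine, and note the setting is in kilobytes):

```sh
# Check the current value (in kB)
cat /proc/sys/vm/min_free_kbytes

# Raise it temporarily, e.g. to ~100MB, so the kernel starts reclaiming
# well before memory is completely exhausted
sysctl -w vm.min_free_kbytes=102400

# Make it persistent across reboots
echo "vm.min_free_kbytes = 102400" >> /etc/sysctl.conf
```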

Nikolaidis Fotis
  • Thanks. I enlarged it, but that doesn't seem to solve the problem -- once physical memory was close to exhaustion, there was no buffer memory left, and the machine slowed to a crawl. –  Aug 18 '10 at 03:51
  • No help here either, system still goes completely unresponsive. – Tronic Nov 23 '10 at 11:47
  • This actually helped me. I also have 2GB RAM and I set this to almost 500MB; no slowdowns/hangups so far. – Krišjānis Nesenbergs May 24 '11 at 22:25
  • I am currently testing this setting on my workstation. I have 8 GB RAM and most of the time I don't use more than 5... except when for some reason I have to fire up a Windows VM that requires about 4 GB RAM. I have ZRAM set up on my host OS because my hard drive is mechanical, but it still becomes quite slow with the RAM almost full, due precisely to low RAM space for filesystem buffers and caches. I just used vm.min_free_kbytes to make sure I always have at least 2 GB free and have the rest paged to zipped RAM (which is *way* faster than normal swap space). Will post later with results. – RAKK Jun 08 '17 at 18:05

Waiting for the oom-killer to free up memory is a bit like waiting for your car's engine to stall to tell you it's time to fill up the gas tank. The oom-killer is a heavy-handed tool of last resort for a desperate, resource-starved machine. It kills the next program it touches with no consideration for how this will affect your application, reachability, reliability, and so forth. When the oom-killer is invoked, your server is already gasping for breath and in critical condition.

Instead, you're much better off taking an active approach to managing memory usage within your application environment. You can monitor /proc/meminfo for signs of trouble, take appropriate action, and throttle back your workload before a serious situation gets ugly.
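
For example, a rough sketch of such a watchdog (the ~200MB threshold and the log-and-throttle action are placeholders; on older kernels without a MemAvailable field, MemFree + Buffers + Cached is a usable approximation):

```sh
#!/bin/sh
# Rough watchdog: act when free + reclaimable memory drops too low.
THRESHOLD_KB=204800   # ~200MB; pick a value that suits your workload

while true; do
    # Sum MemFree, Buffers and Cached from /proc/meminfo (values are in kB)
    avail_kb=$(awk '/^(MemFree|Buffers|Cached):/ {sum += $2} END {print sum}' /proc/meminfo)

    if [ "$avail_kb" -lt "$THRESHOLD_KB" ]; then
        logger "low memory: ${avail_kb} kB left, throttling workload"
        # ... pause or stop the least important job here ...
    fi
    sleep 5
done
```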

tylerl
  • The situation I discovered is *exactly* the time at which my server is gasping for breath and in critical condition. It takes less than 20 seconds to go from a fully responsive machine to one that needs 1 minute to respond to Ctrl-Alt-F1 (switching from X to the console). And logging in is impossible because it times out after 1 minute without even asking for a password. This is a machine that has many processes running; each one, independently, is NOT the problem. Also, this is strictly a memory issue -- CPU is fine, and disk is fine as long as there are about 50MB of disk buffers left. –  Aug 18 '10 at 03:50
  • What if you use ulimit, and take an action when an application goes over a threshold? – Nikolaidis Fotis Aug 18 '10 at 09:45
  • The problem is the sum of all applications; 20 or so are running, each with 20-100 MB allocated. It works fine for weeks, even months, but when they all want ~100MB allocated at the same time, everything crashes and burns; I'd rather have the oom_killer kill one of them than have to reboot the machine myself. Anyway, I've turned on swap for now -- most apps don't use all their memory all the time, so the machine remains stable even when stressed to the limit of physical memory; however, I would rather have no swap at all for this machine, if I can. –  Aug 19 '10 at 16:36
  • 1
    Does not solve the actual problem which is a combination of not being to set proper memory usage limits (ulimits are not very useful), applications easily going havoc with memory allocations, the OOM killer failing to fire early enough and the massive disk trashing and unresponsiveness caused by all that. I just wasted 30 minutes of my employer's time because the development machine would trash the disk for half an hour while compiling my code, instead of simply killing the Chromium processes it needed to kill (or the compile itself) in less than a second and then be done with it. – Tronic Nov 23 '10 at 10:38
  • If you set `oom_adj` correctly you can have your desktop system working a bit like Android, where the system is practically always running against the OOM killer (technically there's a "low memory killer", tuned via `/sys/module/lowmemorykiller`). The logic is to continuously mark non-critical background processes as potential victims for the OOM killer, watch for killed processes, and slowly restart the required ones to avoid overburdening the system. Just make sure that the process that keeps re-launching other processes is exempt from the OOM killer. – Mikko Rantalainen Jan 07 '18 at 17:38
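
A minimal sketch of the per-process part of that approach (the process name is just an example; newer kernels use /proc/<pid>/oom_score_adj with a range of -1000..1000, older ones use /proc/<pid>/oom_adj with -17..15):

```sh
# Mark a non-critical background process as the preferred OOM victim
pid=$(pidof -s chromium)                 # example process name
echo 1000 > /proc/$pid/oom_score_adj     # killed first under memory pressure

# Exempt the supervisor that re-launches killed programs
echo -1000 > /proc/$$/oom_score_adj      # here: the current shell, as an example
```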