
I'm encountering low-memory issues on Proxmox 7 nodes I manage. While reading about similar problems I was directed to linuxatemyram.com, and after reading that page I started monitoring "available" memory instead of "used" memory. But the problem persisted: available memory decreases with uptime.

I then found that I could force Linux to free its caches by issuing the command echo 3 > /proc/sys/vm/drop_caches. I was expecting to see "used" memory become free, but I didn't expect "available" memory to increase, because as far as I understand, memory used by Linux for caching is already counted as available even though it shows up as used.

But "available" memory increased after drop_caches, as you can see bellow :

root@proxmox13:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:            31Gi        29Gi       1.1Gi        67Mi       258Mi       977Mi
Swap:             0B          0B          0B
root@proxmox13:~$ echo 2 > /proc/sys/vm/drop_caches
root@proxmox13:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:            31Gi        26Gi       4.1Gi        67Mi       205Mi       3.9Gi
Swap:             0B          0B          0B

Why did it increase? Why wasn't the freed memory considered available before, if it was being used for caching?

Thanks for your help.

Teriblus

1 Answer


At a high level, available is free plus caches and other easily reclaimed memory, a convenience for humans. https://www.linuxatemyram.com/ uses available in an attempt to explain what is going on. Other counters exist for the various caches that exclude free memory.
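As a rough sanity check, you can compare the kernel's MemAvailable against a naive free-plus-reclaimable sum from /proc/meminfo. This is only a sketch: the kernel's actual estimate (described in Documentation/filesystems/proc.rst) also accounts for low-memory watermarks, so the two numbers will not match exactly.

```shell
# Naive "available" approximation: free + page cache + reclaimable slab.
# Values in /proc/meminfo are in kB.
awk '/^MemFree:/       {free=$2}
     /^MemAvailable:/  {avail=$2}
     /^Cached:/        {cached=$2}
     /^SReclaimable:/  {sr=$2}
     END {
       printf "kernel MemAvailable:              %d kB\n", avail
       printf "free + page cache + recl. slab:   %d kB\n", free + cached + sr
     }' /proc/meminfo
```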

In reality, the Linux VMM is complicated and messy; memory use rarely adds up exactly with simple accounting. I think Cached in /proc/meminfo means page cache, but echo 2 also drops dentries and inodes, which live in the kernel slab caches. That is why buff/cache in free did not change much. Try slabtop if you ever need to dig into kernel objects in detail.
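A quick way to see how much memory is in reclaimable kernel slab versus page cache (field names as they appear in /proc/meminfo; slabtop ships with procps):

```shell
# Page cache vs. kernel slab caches (dentries and inodes live in slab)
grep -E '^(Buffers|Cached|Slab|SReclaimable|SUnreclaim):' /proc/meminfo

# slabtop -o prints one snapshot instead of running interactively;
# -s c sorts by cache size, so dentry/inode caches usually top the list
slabtop -o -s c | head -n 15
```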

One GB available out of 32 is not a lot from a capacity-planning perspective. Consider reducing the number of guests per VM host, or adding physical memory.

Do not use /proc/sys/vm/drop_caches in normal operation; it is likely to hurt performance, both from the work of dropping the caches and from re-reading data from disk. It is meant for cold-cache storage performance testing, when people feel too lazy to reboot the host.
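If you do use it for a cold-cache benchmark, the conventional sequence (a sketch; requires root) is to flush dirty pages first, since drop_caches only discards clean pages:

```shell
sync                                # write dirty pages back to disk first
echo 3 > /proc/sys/vm/drop_caches   # 1 = page cache, 2 = dentries+inodes, 3 = both
```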

Speaking of reboots: programs do not have to leak memory for available to slowly decrease. A VM host and the guests inside it run probably thousands of tasks, some of which stay running and keep various memory allocations. You should be rebooting every few months for software updates anyway, so as long as the "leak" is slow it might not be worth investigating in detail.

Improve your memory monitoring by also looking at pressure stall information (PSI). The metric that actually matters is whether tasks are stalling for lack of memory, and PSI tracks exactly that.
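On kernels 4.20 and newer with PSI enabled, the counters live under /proc/pressure. A sketch of reading them; the 10% alert threshold below is an assumption to illustrate the idea, not a standard value:

```shell
# "some" = fraction of time at least one task stalled waiting on memory;
# "full" = all non-idle tasks stalled at once (whole host stuck reclaiming)
cat /proc/pressure/memory

# Extract the 10-second "some" average and flag it when it exceeds
# an arbitrary 10% threshold (tune for your workload)
awk '/^some/ { split($2, a, "="); if (a[2] + 0 > 10) print "memory pressure high:", a[2] }' /proc/pressure/memory
```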

John Mahowald