
I have a number of servers that we use as KVM hypervisor nodes. We want to enable hugepages and leverage this feature for performance reasons.

I have looked online for how to enable hugepages, and that part is quite clear and simple. What I can't find, though, is how to determine the hugepages count value that should be used.

To give you some perspective, this is the system we have (same across the entire cluster):

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            503           5         497           0           1         495
Swap:             0           0           0

We want to enable hugepages of 1GB in size, but the HugePages count is what we don't know how to define. How is this number determined? Is it memory based? Any input would be appreciated.

Config in question:

GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=1G hugepages=<what goes here> transparent_hugepage=never"

1 Answer


For testing, fully load one of these hypervisor hosts with a typical workload on small pages; that is, don't set hugepages to start.

Capture a few samples of /proc/meminfo. Watch whether PageTables: becomes too large a fraction of memory; combined with heavy system CPU, that's overhead. The overhead might not become intolerable, but 500 GB is a lot of memory, and huge pages should provide a benefit.
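As a sketch of that check: sample the relevant meminfo fields under load, then see what fraction of memory is going to page tables. The numbers below are hypothetical; MemTotal roughly matches the ~503 GB host above, and the 10 GiB PageTables figure is a made-up example.

```shell
# Sample the relevant /proc/meminfo fields a few times under load
# (the interval and count here are arbitrary choices):
#   for i in 1 2 3; do grep -E '^(MemTotal|PageTables):' /proc/meminfo; sleep 60; done
#
# Then compute the fraction. Both values are in kB; these are
# hypothetical sample numbers, not measurements:
awk -v pt=10485760 -v mt=527433728 \
    'BEGIN { printf "%.1f%%\n", 100 * pt / mt }'   # prints 2.0%
```

Even a couple of percent of a 500 GB host tied up in page tables is tens of gigabytes, which is the kind of result that would justify switching to huge pages.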

DirectMap1G: is what the kernel is tracking with 1 GB mappings, in kB. These are not necessarily explicit huge pages, just contiguous regions the kernel was able to find. Divide the value from your experimentation by 1048576 to get a starting value for gigantic pages.

Record DirectMap2M: too; these are 2 MB mappings, in case that huge page size works better.
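The DirectMap1G arithmetic looks like this; the value used below is a hypothetical reading, not one from your host.

```shell
# On a live host you could pull the field directly:
#   awk '/^DirectMap1G:/ { print $2 }' /proc/meminfo
#
# Hypothetical DirectMap1G reading in kB; 1 GiB = 1048576 kB.
directmap1g_kb=511705088
echo $(( directmap1g_kb / 1048576 ))   # prints 488: a starting count of 1 GB pages
```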

Huge page allocations cannot be 100% of a host's memory: there are 4k-only uses, and VM hosts should not overcommit guest memory. As an approximation, reserve at least a few percent for the kernel, caches, administrative programs, and a little free memory. Most of the remainder could work for huge pages.

RHEL docs suggest an early boot script to poke /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages with the number of pages. Note that if you have multiple NUMA nodes you would want to tweak the script to poke a value into each.
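A minimal sketch of such a boot script: the sysfs path is the standard per-NUMA-node hugepages interface, but the count of 240 pages per node is a hypothetical value you would replace with your own sizing.

```shell
#!/bin/sh
# Early-boot sketch in the spirit of the RHEL guidance: allocate
# 1 GB huge pages on each NUMA node. Must run as root, before
# memory fragments. PAGES_PER_NODE=240 is a made-up example.
PAGES_PER_NODE=240
for node in /sys/devices/system/node/node*; do
    echo "$PAGES_PER_NODE" > "$node/hugepages/hugepages-1048576kB/nr_hugepages"
done
```

After it runs, compare the per-node nr_hugepages files against what you asked for; the kernel may allocate fewer pages than requested if contiguous memory is unavailable.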

hugepages on the kernel command line remains an option instead, but then you'd have to maintain numbers specific to each host's memory sizing in your bootloader config.

Speaking of bootloader config, I agree with transparent_hugepage=never. THP incurs significant overhead defragmenting pages and shuffling them around, which is undesirable on large-memory boxes with a known workload.

John Mahowald