
I am getting the following warning about latency issues in the Redis log:

WARNING you have Transparent Huge Pages (THP) support enabled in your kernel.
To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root
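
Before changing the setting, it can help to confirm which THP mode is currently active. Here is a minimal Python sketch, assuming the usual sysfs format in which the active mode is wrapped in square brackets (for example `always madvise [never]`). The helper names `parse_thp_mode` and `current_thp_mode` are illustrative only and are not part of Redis or the kernel:

```python
import re
from pathlib import Path

# Same sysfs file the Redis warning refers to.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")

def parse_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs file contents."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unrecognized THP setting: {text!r}")
    return match.group(1)

def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or return None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        # Assumption: a missing or unreadable file means THP is not exposed here.
        return None

if __name__ == "__main__":
    print(current_thp_mode())
```

If this prints `never`, THP is already disabled and the warning should not appear after Redis is restarted.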

What are the side effects or downsides of disabling Transparent Huge Pages (THP)?

The kernel has it enabled by default.

osgx
Kuldeep Dangi
  • `jemalloc` has problems with THP and `madvise MADV_DONTNEED`, as posted by LuFFy, but there is also a possible theoretical [false-sharing](http://en.wikipedia.org/wiki/False_sharing)-like problem on NUMA machines: https://lkml.org/lkml/2016/2/25/623 "*old problem whereby there can be false sharing of NUMA pages within a THP boundary. Consider for example if threads are calculating 4K blocks and then it gets migrated as a THP including unrelated threads.*" – osgx Mar 03 '17 at 06:31
  • that error also says "and add it to your /etc/rc.local in order to retain the setting after a reboot" how do you do that? – Agent Zebra Mar 15 '19 at 23:34
  • Sorry to post in a six year old question, but is everyone missing the point? The question asks what are the CONS of disabling THP. All answers point to why you should do it or how to do it. No one has addressed the downsides of turning it off. If I'm off base, please let me know and I'll delete this comment. I've also flagged the first post as condescending because the documentation does NOT address possible reasons NOT to disable THP. – brunson Feb 01 '23 at 23:34

1 Answer


As per DigitalOcean's article, Transparent Huge Pages and Alternative Memory Allocators: A Cautionary Tale:

Recently, our site reliability engineering team started getting alerted about memory pressure on some of our Redis instances which have very small working sets. As we started digging into the issue, it became clear that there were problems with freeing memory after initial allocation because there were a relatively small number of keys but a comparatively large amount of memory allocated by redis-server processes. Despite initially looking like a leak, the problem was actually an issue between an alternative memory allocator and transparent huge pages.

Why is disabling THP required?

This rabbit hole began when a redis-server process, which had recently been moved over to LD_PRELOAD jemalloc.so, began using significant amounts of memory. Initial signs pointed to the fact that using an alternative allocator might be part of the issue, so that's where we started digging.

It turns out that jemalloc(3) uses madvise(2) extensively to notify the operating system that it's done with a range of memory which it had previously malloc'ed. Because the machine used transparent huge pages, the page size was 2MB. As such, a lot of the memory which was being marked with madvise(..., MADV_DONTNEED) was within ranges substantially smaller than 2MB. This meant that the operating system never was able to evict pages which had ranges marked as MADV_DONTNEED because the entire page would have to be unneeded to allow it to be reused.

So despite initially looking like a leak, the operating system itself was unable to free memory because of madvise(2) and transparent huge pages. This led to sustained memory pressure on the machine and redis-server eventually getting OOM killed.
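
The effect described in the article can be illustrated with a deliberately simplified model: a page counts as reclaimable only when a single freed range covers it completely. This is a sketch of the article's description, not of the kernel's real behavior, which is more involved. The function name `reclaimable_bytes` and the workload below are hypothetical:

```python
HUGE_PAGE = 2 * 1024 * 1024  # 2 MB huge page, as in the article
SMALL_PAGE = 4 * 1024        # 4 KB base page

def reclaimable_bytes(freed_ranges, page_size):
    """Sum the bytes of pages fully covered by one freed (start, length) range.

    Simplified model: a page can be reclaimed only if a single freed range
    covers it completely. Ranges are assumed not to overlap or touch.
    """
    total = 0
    for start, length in freed_ranges:
        end = start + length
        first_full = -(-start // page_size) * page_size  # round start up to a page boundary
        last_full = (end // page_size) * page_size        # round end down to a page boundary
        if last_full > first_full:
            total += last_full - first_full
    return total

# Hypothetical workload: 256 freed chunks of 1 MB, one at the start of each 2 MB region.
freed = [(i * HUGE_PAGE, 1024 * 1024) for i in range(256)]

print(reclaimable_bytes(freed, SMALL_PAGE))  # all 256 MB: every 4 KB page in a chunk is covered
print(reclaimable_bytes(freed, HUGE_PAGE))   # 0: no 2 MB page is fully covered
```

Under this model the same 256 MB of freed memory is fully reclaimable with 4 KB pages and not reclaimable at all with 2 MB pages, which matches the "looks like a leak" symptom in the quote.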

As per the Redis latency problems troubleshooting documentation:

Transparent huge pages must be disabled from your kernel. Use echo never > /sys/kernel/mm/transparent_hugepage/enabled to disable them, and restart your Redis process.

Latency induced by transparent huge pages

Unfortunately, when a Linux kernel has transparent huge pages enabled, Redis incurs a big latency penalty after the fork call is used in order to persist on disk. Huge pages are the cause of the following issue:

  1. Fork is called, two processes with shared huge pages are created.
  2. In a busy instance, a few event loop runs will cause commands to target a few thousand pages, causing the copy on write of almost the whole process memory.
  3. This will result in big latency and big memory usage.
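
The three steps above can be put into rough numbers. The sketch below assumes that copy-on-write duplicates one whole page for every distinct page written after the fork; the 1 GiB instance size, the 2,000 evenly spread writes, and the name `bytes_copied_on_write` are hypothetical values chosen only for illustration:

```python
SMALL_PAGE = 4 * 1024         # 4 KB base page
HUGE_PAGE = 2 * 1024 * 1024   # 2 MB transparent huge page

def bytes_copied_on_write(write_offsets, page_size):
    """Bytes duplicated by copy-on-write: one full page per distinct page written."""
    touched_pages = {offset // page_size for offset in write_offsets}
    return len(touched_pages) * page_size

# Hypothetical instance: 1 GiB of memory, 2,000 writes spread evenly across it.
MEMORY = 1024 ** 3
writes = [i * (MEMORY // 2000) for i in range(2000)]

print(bytes_copied_on_write(writes, SMALL_PAGE))  # 8192000 bytes, about 7.8 MiB
print(bytes_copied_on_write(writes, HUGE_PAGE))   # 1073741824 bytes, the whole 1 GiB
```

With 4 KB pages the child and parent diverge by a few megabytes; with 2 MB pages the same writes touch every huge page, so nearly the entire process memory is copied.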
CL.
LuFFy
  • This only addresses the possible benefits of disabling THP, not the reasons you might not want to. It does not answer the question asked. – brunson Feb 01 '23 at 23:36