
I'm running a Java application with ~80 GB of memory-mapped files that must be accessible via TCP, using an AWS r3.8xlarge for that (the instance is reserved, so migrating to a non-NUMA architecture is not an option, at least right now), and I have the following problem:

  • During peak loads I notice increased sys time, and perf shows that most of that time is spent in NUMA page migration. Is it possible to somehow replicate memory across NUMA nodes to prevent pages from being continuously moved from one node to another? (All CPUs access all ~80 GB of memory.)

(The whole data set is read-only and can be pre-allocated during application startup.)
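The sys time spent in page migration is usually the kernel's automatic NUMA balancing at work: it migrates pages toward whichever CPUs touch them. A minimal sketch, assuming a Linux kernel that exposes the `numa_balancing` sysctl, for checking its state plus two commented-out mitigations (disabling balancing, or interleaving allocations with numactl; note that interleaving only spreads pages round-robin, it does not replicate them per node):

```shell
# Check whether automatic NUMA balancing is on (1 = enabled);
# when enabled, the kernel migrates pages toward the CPUs that
# touch them, which shows up as page-migration time in perf.
state=$(cat /proc/sys/kernel/numa_balancing 2>/dev/null || echo "unavailable")
echo "numa_balancing: $state"

# Possible mitigation (needs root): turn balancing off so pages
# stay on the node where they were first faulted in.
# echo 0 | sudo tee /proc/sys/kernel/numa_balancing

# Alternative: interleave the JVM's allocations across all nodes
# so no single node becomes a remote-access hotspot. This spreads
# pages; it does NOT create a replica on each node. The jar name
# is a placeholder.
# numactl --interleave=all java -jar app.jar
```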


Stanislav Levental
  • Why does your application need to access all the data randomly? Can you not have N services, each with 1/N of the data, running in its own NUMA region? – Peter Lawrey Mar 01 '16 at 08:18
  • 1
    To keep the application logic simple: it's like a big read-only database with random lookups, and each request may need data from different parts as well. – Stanislav Levental Mar 01 '16 at 08:23
  • You are right that it would be better if the OS just knew to keep multiple copies, but what if you kept copies on disk? – Peter Lawrey Mar 01 '16 at 08:26
  • 1
    I was thinking about running 2 Java processes, each with privately mmapped files on its own NUMA node, but that's a little hard to support; so I'm wondering whether that's the only way to get a replica on each NUMA node? – Stanislav Levental Mar 01 '16 at 08:30
  • Are transparent huge pages (THP) enabled? If so, try with THP disabled. I have seen heavy-memory applications recommend disabling THP for better performance. `grep -i "^AnonHuge" /proc/meminfo` will show if they are in use – VenkatC Mar 02 '16 at 02:34
  • I've checked this, but they were turned off – Stanislav Levental Mar 02 '16 at 20:51
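The two-process idea from the comments above can be sketched as follows. The jar name, ports, and node numbers are placeholders, and numactl must be installed; each JVM mmaps its own private copy of the files, so every node serves lookups from node-local memory and nothing needs to migrate:

```shell
# Inspect the topology first: how many nodes, which CPUs and how
# much memory belong to each.
numactl --hardware 2>/dev/null || echo "numactl not installed"

# Bind one JVM (both CPUs and memory) to each node; a front end
# would then route TCP requests to either port. All names and
# ports here are hypothetical.
# numactl --cpunodebind=0 --membind=0 java -jar app.jar --port 9000 &
# numactl --cpunodebind=1 --membind=1 java -jar app.jar --port 9001 &
```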

0 Answers