MongoDB 2.46 & 2.4.8
Use case:
- Load up 100.000 documents on a collection with 2 indexes. Resident memory increases (mongostat), and no page faults happen.
- Restart mongod. Resident memory is low (this is expected)
- Try to 'preheat' mongo, with touch command db.runCommand({ touch: collection, data: true, index: true }) or other means (on OS, vmtouch / dd)
a) On this step, on my development machine (MacOS), I see in mongostat a lot of page faults trying to heat it up (expected) and the resident memory is raised. From that point on, any updates do not raise page faults
b) On a numa server (256 GB RAM), even though I started up mongo with this guide: http://docs.mongodb.org/manual/administration/production-notes/#mongodb-on-numa-hardware (note: I do not have superuser access. However, the 2nd step, echoing 0 in /proc/sys/vm/zone_reclaim_mode, is already 0 so I left it like that), I cannot seem to be able to pre-heat the memory with the 'touch' command. Nothing happens, even though it returns successfully. In mongostat, only 'mapped' and 'vsize' is getting higher, and resident memory is the same (35m). I even tried to load up the data files in OS's memory with vmtouch and dd commands. Only re-indexing the collection changed the resident memory.
The problem started a while after I began to load up data into the server. I do a lot of upserts and the performance was awesome in the beginning (3000 - 4000 upserts/sec). This was expected because the working set would be able to fit in memory. After 30.000.000 documents the process seems to make a lot of page faults and I do not know why. The data files are approx. 33GB and the performance is about 500 upserts/sec, with a lot of page faults. That should mean that the working set is not in memory. However, 256GB RAM should be more than enough. I tried the 'touch' command, but resident memory was low (I even restarted the mongod process, ran the touch command, and even though 'mapped' and 'vsize' skyrocketed to a lot of GB, resident memory kept low, 35m). I tried to reIndex the collection and voilà, resident memory went from 35m -> 20GB. However, again, I saw page faults. Then I tried to vmtouch the data files (or with dd). Again, a lot of page faults.
The problem is that I cannot have 'only' 500 upserts/sec. Should I change my application logic? I thought with 256GB memory my 'active' working set (expected 60GB) should fit in memory. I am in the middle (30GB) and it seems that I cannot do anything to fix this. Is it the numa hardware? Should I make any other changes?
Thanks in advance