
I am running MongoDB 4.x on Linux in a clustered setup (3 replicas), and sometimes memory inexplicably spikes: the mongod process suddenly consumes >10% more of the 64GB of RAM on the machine and doesn't come back down, sometimes for hours. Sometimes this happens several times over a short period, so swap gets consumed and ultimately the whole database slows down, replication lag grows, and the cluster becomes generally unstable.
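
For reference, this is roughly how I sample the process from mongosh during a spike (field names as I understand the standard serverStatus output; the WiredTiger check is just to rule the cache in or out):

    // mongod's own view of its memory (values are in MB)
    var mem = db.serverStatus().mem;
    printjson({ residentMB: mem.resident, virtualMB: mem.virtual });

    // WiredTiger cache usage, to see whether the spike is cache growth or something else
    var wt = db.serverStatus().wiredTiger.cache;
    printjson({
        cacheBytes: wt["bytes currently in the cache"],
        cacheMaxBytes: wt["maximum bytes configured"]
    });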

The workload of the DB is fairly high: average CPU load of 50-80% on an 8-core machine, and average memory consumption around 70% of 64GB. The workload is a mix of high-speed writes and batched reads. I try to direct all heavy reads to the secondaries so that the primary can focus on writes, but sometimes large reads hit the primary too.
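
For context, heavy reads are routed with a secondary read preference along these lines (the events collection and the filter are made-up examples):

    // Send a batched read to a secondary rather than the primary
    db.getMongo().setReadPref("secondaryPreferred");
    var batch = db.events.find({ day: ISODate("2022-07-13") }).batchSize(1000).toArray();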

During a spike, db.currentOp() doesn't reveal anything long-running, though some queries that should not take long (a simple find() on a tiny collection) can take seconds around the time of these spikes.
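
Concretely, what I check during a spike is along these lines (the 2-second threshold is arbitrary):

    // Active operations that have been running for more than a couple of seconds
    db.currentOp({ active: true, secs_running: { $gte: 2 } }).inprog.forEach(function (op) {
        printjson({ opid: op.opid, secs: op.secs_running, ns: op.ns, op: op.op });
    });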

What can I do to see which queries are suddenly consuming all this memory? I have been looking for slow queries, but that feels like an (inaccurate) proxy for finding what is actually consuming the memory.
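
Is watching the allocator-level counters and the profiler, roughly as below, the best available proxy, or is there something closer to a per-operation memory view? (Field names are as I understand the standard serverStatus output.)

    // tcmalloc view: bytes actually allocated vs. how large the heap has grown
    var tc = db.serverStatus().tcmalloc;
    printjson({
        allocatedBytes: tc.generic.current_allocated_bytes,
        heapSizeBytes: tc.generic.heap_size
    });

    // Profiler at level 1 (operations slower than 100 ms), then inspect recent entries
    db.setProfilingLevel(1, 100);
    db.system.profile.find().sort({ ts: -1 }).limit(5).pretty();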

Lorenz
  • Can you see the number of queries active at the same time? Even simple queries can have an impact if there are many of them. I once had a student who used the asynchronous behavior of JavaScript the wrong way, which resulted in 10k queries being sent to the server in a second. Sometimes, because of this, mongod was killed by the OOM killer... – Robert Jul 13 '22 at 21:08
  • There is an increase during the spike, yes, but I can only see the number of connections, and instead of 100 active connections it may be 150. I don't think that would be enough to justify such a spike? – Lorenz Jul 14 '22 at 18:20

0 Answers