
I run MongoDB 4.0 on WiredTiger under Ubuntu Server 16.04 to store complex documents. There is an issue with one of the collections: its documents contain many images stored as base64-encoded strings. I understand this is bad practice, but I need some time to fix it.

Because of this, some find operations fail, but only those with a non-empty filter or a skip. For example, db.collection('collection').find({}) runs OK, while db.collection('collection').find({category: 1}) just closes the connection after a timeout. It doesn't matter how many documents should be returned: if there's a filter, the error occurs every time (even if the query should return 0 docs), while an empty query always executes well until skip gets too big.

UPD: some skip values make queries fail. db.collection('collection').find({}).skip(5000).limit(1) runs well, db.collection('collection').find({}).skip(9000).limit(1) takes much longer but still executes, while db.collection('collection').find({}).skip(10000).limit(1) fails every time. It looks like there's some kind of buffer where the DB stores query-related data, and at around 10000 docs it runs out of resources. The collection itself has ~10500 docs. Also, searching by _id runs OK. Unfortunately, I have no opportunity to create new indexes, because that operation fails just like the reads.
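For reference, here is a minimal shell sketch of how the skip threshold can be probed (collection name as in the examples above):

```js
// Probe which skip value starts failing (values taken from the tests above).
// Run in the mongo shell against the affected database.
[5000, 9000, 10000].forEach(function (s) {
    var t0 = Date.now();
    try {
        db.getCollection('collection').find({}).skip(s).limit(1).toArray();
        print('skip=' + s + ': ok in ' + (Date.now() - t0) + ' ms');
    } catch (e) {
        print('skip=' + s + ': failed: ' + e);
    }
});
```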

What temporary solution can I use until I remove the base64 images from the collection?

JustLogin
  • Try running one of the queries with `explain` to see what the planner is doing, and check the mongod log for errors. – Joe Oct 27 '21 at 16:06
  • Try using aggregate, though I doubt it will give a different result. As for "even if it should return 0 docs": try setting a `limit(1)`. – dododo Oct 27 '21 at 16:11
  • @Joe If I do this, the query ends with a timeout and the DB restarts after ~30 sec. I see no errors, only a shutdown/start sequence in the logs. – JustLogin Oct 27 '21 at 16:19
  • @dododo same thing – JustLogin Oct 27 '21 at 16:31
  • Then you may look at the server logs to see the issue (try setting `$comment` https://docs.mongodb.com/v4.0/reference/operator/query/comment/ to find the related log entries quicker). – dododo Oct 27 '21 at 16:34
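Combining the two diagnostic suggestions from the comments (`explain` and `$comment`), a minimal shell sketch; the collection name and `category` field come from the question, and the comment string is arbitrary:

```js
// Tag the query with $comment so its entries are easy to find in mongod.log,
// and ask the planner for executionStats.
db.getCollection('collection').find(
    { category: 1, $comment: "debug-category-find" }
).limit(1).explain("executionStats");
```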

1 Answer


This happens because such a problematic data scheme causes huge RAM usage: the more documents the collection holds, the more RAM is needed not just to perform well, but even to complete a find.

Increasing MongoDB's default cache size with the storage.wiredTiger.engineConfig.cacheSizeGB config option allowed all the operations to run fine.
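For example, in /etc/mongod.conf (the 2 GB figure is only an illustration; size it to the host's available RAM, keeping in mind the default is the larger of 50% of (RAM minus 1 GB) or 256 MB):

```yaml
# /etc/mongod.conf: raise the WiredTiger cache above the default
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 2   # example value; tune to the host's RAM
```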

JustLogin
  • This seems like it is only hiding the underlying problem. If the cache gets full, the eviction workers will move in to free up space. Unless the OOM killer was responsible for the service getting killed, there is something else also affecting this. – Joe Oct 28 '21 at 01:17
  • @Joe I think this may be Docker with the Docker Compose add-on. Running many containers on the same machine makes RAM usage less transparent. – JustLogin Oct 28 '21 at 16:02