I am trying to compare MongoDB (latest from the git repo) compression rates of snappy, zstd, etc. Here is the relevant snippet from my /etc/mongod.conf:
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      journalCompressor: snappy
    collectionConfig:
      blockCompressor: snappy
    indexConfig:
      prefixCompression: true
My test case inserts entries into a collection. Each document has an _id and 1MB of binary data, randomly generated using faker. I insert 5GB/7GB of data, but the storage size does not appear to be compressed. The AWS instance hosting mongod has 15GB of memory and 100GB of disk space. Here is example data collected from dbStats:
5GB data:
{'Data Size': 5243170000.0,
'Index Size': 495616.0,
'Storage size': 5265686528.0,
'Total Size': 5266182144.0}
7GB data:
{'Data Size': 7340438000.0,
'Index Size': 692224.0,
'Storage size': 7294259200.0,
'Total Size': 7294951424.0}
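For reference, the insert workload can be sketched roughly like this (assumptions: pymongo is installed, mongod runs on localhost, and the "testdb"/"compression_test" names are made up for illustration; os.urandom stands in for faker's binary()):

```python
import os

ONE_MB = 1024 * 1024

def make_payload(size=ONE_MB):
    # 1 MB of random bytes, standing in for faker's binary() output.
    # Random bytes have near-maximal entropy, which matters for
    # interpreting the compression numbers.
    return os.urandom(size)

def load(collection, n_docs):
    # Insert n_docs documents of {_id, 1 MB binary} -- roughly n_docs MB total.
    for i in range(n_docs):
        collection.insert_one({"_id": i, "payload": make_payload()})

# Live run (commented out; requires a running mongod):
# from pymongo import MongoClient
# load(MongoClient().testdb.compression_test, 5 * 1024)  # ~5 GB
```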
Is there something wrong with my config? Or does compression not kick in until the data size is substantially larger than memory size, or than the available storage size? What am I missing here?
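If it helps diagnose the config, the compressor actually applied to a collection can be confirmed from collStats, which reports the WiredTiger creationString containing a block_compressor entry (the parsing helper below is my own; the live call assumes pymongo against a local mongod, and the db/collection names are illustrative):

```python
def block_compressor(creation_string):
    # The creationString is a comma-separated WiredTiger config string,
    # e.g. "allocation_size=4KB,block_compressor=snappy,...".
    for item in creation_string.split(","):
        if item.startswith("block_compressor="):
            return item.split("=", 1)[1]
    return None

# Live check (commented out; requires a running mongod):
# from pymongo import MongoClient
# stats = MongoClient().testdb.command("collStats", "compression_test")
# print(block_compressor(stats["wiredTiger"]["creationString"]))
```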
Thanks a ton for your help.