MongoDB consistently hits checksum errors and crashes whenever the uncompressed data size is larger than the cache size.
Some information:
Computer (standalone MongoDB):
CPU AMD 3600X
Memory 32*4 = 128GB
Storage 2TB SSD
Motherboard B450M
System:
Linux 5.4.0-42-generic #46~18.04.1-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
MongoDB data size:
Uncompressed data size 42GB
Compressed data size 12GB (zstd)
Index size 1.6GB
Average object size 1.4KB
My question is what could be causing these crashes. What I know so far:
- If I set cacheSizeGB to 32GB (less than the 42GB uncompressed data size), MongoDB crashes with a checksum error nearly every day. The crashes mostly occur during mongodump and sometimes while updating data.
- If I set cacheSizeGB to 100GB (more than 42GB), no crash happens.
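To compare what the cache is actually doing under each setting, the figures can be read from the serverStatus output. This is only a minimal sketch: it assumes the usual wiredTiger.cache statistic names ("maximum bytes configured" and "bytes currently in the cache"), cache_summary is a hypothetical helper name, and the sample numbers are made up for illustration.

```python
# Sketch: summarise WiredTiger cache usage from a serverStatus document.
GIB = 1024 ** 3


def cache_summary(server_status):
    """Return configured size, current usage, and fill percentage of the cache."""
    cache = server_status["wiredTiger"]["cache"]
    configured = cache["maximum bytes configured"]
    in_cache = cache["bytes currently in the cache"]
    return {
        "configured_gib": configured / GIB,
        "in_cache_gib": in_cache / GIB,
        "fill_pct": 100.0 * in_cache / configured,
    }


if __name__ == "__main__":
    # Made-up numbers, not measurements from this server. A real document
    # would come from pymongo, e.g. MongoClient().admin.command("serverStatus").
    sample = {
        "wiredTiger": {
            "cache": {
                "maximum bytes configured": 32 * GIB,
                "bytes currently in the cache": 24 * GIB,
            }
        }
    }
    print(cache_summary(sample))
```

Logging this once before mongodump starts and once during it would show whether the crash lines up with the cache filling and evicting pages.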
Additional Information:
- All objects are updated every day, and mongodump runs every day for backup.
- System memory buffer/cache usage grows over time until all free memory is used.
- After each checksum error, I run the repair command.
Error Message:
2020-08-07T10:51:39.094+0800 E STORAGE [conn399] WiredTiger error (0) [1596768699:94163][45915:0x7f32835f7700], file:collection-22--9089965868171986819.wt, WT_CURSOR.search: __wt_block_read_off, 274: collection-22--9089965868171986819.wt: read checksum error for 28672B block at offset 6685696000: calculated block checksum doesn't match expected checksum Raw: [1596768699:94163][45915:0x7f32835f7700], file:collection-22--9089965868171986819.wt, WT_CURSOR.search: __wt_block_read_off, 274: collection-22--9089965868171986819.wt: read checksum error for 28672B block at offset 6685696000: calculated block checksum doesn't match expected checksum
2020-08-07T10:51:39.094+0800 E STORAGE [conn399] WiredTiger error (0) [1596768699:94321][45915:0x7f32835f7700], file:collection-22--9089965868171986819.wt, WT_CURSOR.search: __wt_bm_corrupt_dump, 135: {6685696000, 28672, 0xdae2251d}: (chunk 1 of 28): 00 00 00 00 00 00 00 00 d7 d7 8a 01 00 00 00 00 67 75
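To check whether the same file and block fail on every crash or a different one each time, the file name, block size, and offset can be pulled out of each error line. This is a rough sketch: parse_checksum_error and the regular expression are my own illustration, based only on the message format shown above, and other WiredTiger versions may word the message differently.

```python
import re

# Matches the "read checksum error" part of a WiredTiger log message,
# e.g. "<file>.wt: read checksum error for 28672B block at offset 6685696000".
CHECKSUM_RE = re.compile(
    r"(?P<file>\S+\.wt): read checksum error for "
    r"(?P<size>\d+)B block at offset (?P<offset>\d+)"
)


def parse_checksum_error(line):
    """Return file, block size, and offset from a log line, or None if absent."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "size": int(match.group("size")),
        "offset": int(match.group("offset")),
    }


if __name__ == "__main__":
    # Abbreviated from the first error line in this post.
    line = (
        "2020-08-07T10:51:39.094+0800 E STORAGE [conn399] WiredTiger error (0) "
        "file:collection-22--9089965868171986819.wt, WT_CURSOR.search: "
        "__wt_block_read_off, 274: collection-22--9089965868171986819.wt: "
        "read checksum error for 28672B block at offset 6685696000: "
        "calculated block checksum doesn't match expected checksum"
    )
    print(parse_checksum_error(line))
```

Running this over the mongod log after each crash gives a list of (file, offset) pairs that can be compared from day to day.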