I have a database with:
- On-disk size: 19.032 GB (from the show dbs command)
- Data size: 56 GB (from the db.collectionName.stats(1024*1024*1024).size command)
When taking a dump with the mongodump command, we can set the --gzip flag. These are my observations with and without this flag:
mode | dump time | dump size | restore time | mongostat observation |
---|---|---|---|---|
with gzip | 30 min | 7.5 GB | 20 min | insert rate ranged from 30k to 80k per sec |
without gzip | 10 min | 57 GB | 50 min | insert rate was very erratic, ranging from 8k to 20k per sec |
The dump was taken from an 8-core, 40 GB RAM machine (Machine B) and written to a 12-core, 48 GB RAM machine (Machine A). It was then restored from Machine A to another 12-core, 48 GB RAM machine (Machine C), to make sure there was no resource contention between the MongoDB server, mongorestore, and mongodump processes. MongoDB version: 4.2.0.
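
To make the trade-off in the table easier to see, here is a small sketch that adds up the dump and restore times and computes the size ratio. The figures are copied from the table; the helper names `total_minutes` and `size_ratio` are made up for this illustration, and the totals ignore any time spent copying the dump between machines, which the table does not report.

```shell
#!/usr/bin/env bash
# Figures copied from the table above (minutes and GB).
gzip_dump_min=30;  gzip_restore_min=20;  gzip_size_gb=7.5
plain_dump_min=10; plain_restore_min=50; plain_size_gb=57

# Hypothetical helper: dump time plus restore time, in minutes.
total_minutes() {
  echo $(( $1 + $2 ))
}

# Hypothetical helper: ratio of two sizes, rounded to one decimal place.
size_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "with gzip:    $(total_minutes "$gzip_dump_min" "$gzip_restore_min") min end to end"
echo "without gzip: $(total_minutes "$plain_dump_min" "$plain_restore_min") min end to end"
echo "dump size ratio (plain / gzip): $(size_ratio "$plain_size_gb" "$gzip_size_gb")x"
```

With these numbers the gzip run is slower to dump but faster end to end (50 min against 60 min), and its dump is about 7.6 times smaller.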
I have a few questions:
- What is the functional difference between the two dumps?
- Can the plain BSON dump be gzipped afterwards to make it equivalent to a --gzip dump?
- How does the number of indexes affect the mongodump and mongorestore process? (If we drop some unique indexes and recreate them afterwards, will that reduce the total dump and restore time, given that MongoDB will not have to enforce uniqueness during the inserts?)
- Is there a way to make the overall process faster? From these results, it looks like we have to choose between dump speed and restore speed.
- Will a bigger machine (more RAM) for reading the dump and restoring it speed up the overall process?
- Will a smaller dump help the overall time?
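Update, regarding question 2 (can the plain BSON dump be gzipped afterwards?): yes, it can. A dump taken without --gzip, compressed with gzip, restores fine with mongorestore --gzip: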
```
% ./mongodump -d=test
2022-11-16T21:02:24.100+0530 writing test.test to dump/test/test.bson
2022-11-16T21:02:24.119+0530 done dumping test.test (10000 documents)
% gzip dump/test/test.bson
% ./mongorestore --db=test8 --gzip dump/test/test.bson.gz
2022-11-16T21:02:51.076+0530 The --db and --collection flags are deprecated for this use-case; please use --nsInclude instead, i.e. with --nsInclude=${DATABASE}.${COLLECTION}
2022-11-16T21:02:51.077+0530 checking for collection data in dump/test/test.bson.gz
2022-11-16T21:02:51.184+0530 restoring test8.test from dump/test/test.bson.gz
2022-11-16T21:02:51.337+0530 finished restoring test8.test (10000 documents, 0 failures)
2022-11-16T21:02:51.337+0530 10000 document(s) restored successfully. 0 document(s) failed to restore.
```
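
As a sanity check on why compressing after the fact should be safe, the sketch below runs a gzip round trip on a stand-in file and confirms the bytes come back unchanged. It does not call mongodump or mongorestore; the file contents, the temporary directory, and the variable names `workdir` and `roundtrip` are all assumptions made for the illustration, and the file is not real BSON.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for dump/test/test.bson: any byte stream works for this check.
workdir="$(mktemp -d)"
for i in $(seq 1 1000); do
  echo "stand-in record $i"
done > "$workdir/test.bson"

# Keep a copy of the original so we can compare after the round trip.
cp "$workdir/test.bson" "$workdir/original.bson"

# Compress in place; gzip replaces test.bson with test.bson.gz.
gzip "$workdir/test.bson"

# gzip -t checks the integrity of the compressed file.
gzip -t "$workdir/test.bson.gz"

# Decompress to stdout and compare byte for byte with the original.
if gzip -dc "$workdir/test.bson.gz" | cmp -s - "$workdir/original.bson"; then
  roundtrip="identical"
else
  roundtrip="different"
fi
echo "round trip: $roundtrip"
```

Since gzip is lossless, the data that gets decompressed from test.bson.gz is the same data that was written to test.bson, which is consistent with the successful restore shown above.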