My company is in the process of migrating data from one Kubernetes cluster to another.
Part of the migration is to move data out of MongoDB.
The MongoDB installations came with some backup scripts, which I used as an entry point for my custom restore.
What I successfully did (at least as far as I can tell right now) is run a mongodump on the old cluster and pipe it into a mongorestore on the new cluster.
It works, but it is really, really slow. The dataset (/data/db) is around 65 GB. The restore has been running for the last 6 hours or so and is barely moving forward.
Also, at some point the process was interrupted, and instead of deleting all the data I simply started the script again - thinking it would still apply everything and throw errors for duplicate keys, which I could ignore.
This is precisely what I do:
kubectl --kubeconfig=old-cluster.conf exec -t $SOURCE_MONGO_POD -- \
bash -c "mongodump --host $SOURCE_MONGO_REPLICASET \
--username $SOURCE_USERNAME --password $SOURCE_PASSWORD \
--authenticationDatabase admin --gzip --archive --oplog" |
kubectl exec -i $TARGET_MONGO_POD -- \
bash -c "mongorestore --host $TARGET_MONGO_REPLICASET \
--username $TARGET_USERNAME --password $TARGET_PASSWORD \
--authenticationDatabase admin --gzip --archive --oplogReplay"
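For reference, here is a variant I am considering (untested on my clusters): mongorestore's documented --numParallelCollections and --numInsertionWorkersPerCollection flags to insert with more concurrency, plus --drop so a restart wipes the target collections instead of grinding through duplicate-key errors. Everything else is unchanged from my pipeline above.

```shell
kubectl --kubeconfig=old-cluster.conf exec -t $SOURCE_MONGO_POD -- \
  bash -c "mongodump --host $SOURCE_MONGO_REPLICASET \
    --username $SOURCE_USERNAME --password $SOURCE_PASSWORD \
    --authenticationDatabase admin --gzip --archive --oplog" |
kubectl exec -i $TARGET_MONGO_POD -- \
  bash -c "mongorestore --host $TARGET_MONGO_REPLICASET \
    --username $TARGET_USERNAME --password $TARGET_PASSWORD \
    --authenticationDatabase admin --gzip --archive --oplogReplay \
    --drop \
    --numParallelCollections=4 \
    --numInsertionWorkersPerCollection=8"  # restore collections and documents concurrently
```

I am not sure how much this helps when a single large collection dominates the 65 GB, since --numParallelCollections only parallelizes across collections.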
What is wrong with my approach? Why is the performance so bad?
Someone suggested just copying over the /data/db folder, which might be faster, and since I need a 1:1 migration, that would be sufficient.
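If I went that route, I imagine it would look roughly like the sketch below: stream the data directory between pods with tar over kubectl exec. This is an assumption on my part, not something I have run - and as far as I understand, mongod would have to be stopped on both sides while copying, and the MongoDB versions and storage engines would need to match for the files to be usable.

```shell
# Stop mongod in both pods first (e.g. scale the StatefulSets to 0 or stop the process),
# otherwise the copied files may be inconsistent.

# Stream /data/db from the old cluster's pod into the new cluster's pod.
kubectl --kubeconfig=old-cluster.conf exec $SOURCE_MONGO_POD -- \
  tar czf - -C /data db |
kubectl exec -i $TARGET_MONGO_POD -- \
  tar xzf - -C /data

# Then restart mongod in the target pod and verify the data.
```

Would this actually be safe for a replica set, or would I have to copy to every member (or let the secondaries resync from the copied primary)?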