We are using a MongoDB replica set with three nodes. The database is quite large: more than 2 billion records occupying 700 GB on disk (WiredTiger storage engine). The workload is mostly inserts (several million per day), followed by reads and updates.
After replacing a disk on a secondary member, its data folder was empty and an initial sync started. According to the logs, it took about 7 hours to copy the records and then 30 hours to build the indexes. That was too long for the oplog to retain all the operations that were inserted or updated in the meantime:
2016-11-16T23:32:03.503+0100 E REPL [rsBackgroundSync] too stale to catch up -- entering maintenance mode
2016-11-16T23:32:03.503+0100 I REPL [rsBackgroundSync] our last optime : (term: 46, timestamp: Nov 15 10:03:15:8c)
2016-11-16T23:32:03.503+0100 I REPL [rsBackgroundSync] oldest available is (term: 46, timestamp: Nov 15 17:37:57:30)
2016-11-16T23:32:03.503+0100 I REPL [rsBackgroundSync] See http://dochub.mongodb.org/core/resyncingaverystalereplicasetmember
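To put numbers on the gap, here is a small Python sketch using only the timestamps from the log lines above. It assumes the year is 2016 (taken from the log prefix), that all three timestamps are in the same time zone, and that the sync source's newest oplog entry was written at about the moment the member went stale:

```python
from datetime import datetime

# Values copied from the log lines above; the year 2016 is an assumption
# taken from the log prefix, and all times are assumed to share a time zone.
last_optime = datetime(2016, 11, 15, 10, 3, 15)        # "our last optime"
oldest_available = datetime(2016, 11, 15, 17, 37, 57)  # "oldest available is"
went_stale_at = datetime(2016, 11, 16, 23, 32, 3)      # time of the error line

# Approximate window the sync source's oplog covered when the member went stale.
oplog_window = went_stale_at - oldest_available
# Window the member would have needed in order to catch up.
needed_window = went_stale_at - last_optime
# How far the member's last optime fell behind the oldest retained entry.
shortfall = oldest_available - last_optime

print("oplog window on sync source:", oplog_window)
print("window the sync needed:     ", needed_window)
print("shortfall:                  ", shortfall)
```

So the oplog covered roughly 30 hours, while the initial sync needed about 37.5 hours, which is consistent with the 7 + 30 hours seen in the logs.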
First we restarted this member, and a resync started:
2016-11-16T23:47:22.974+0100 I REPL [rsSync] initial sync pending
2016-11-16T23:47:22.974+0100 I REPL [ReplicationExecutor] syncing from: x3:27017
2016-11-16T23:47:23.219+0100 I REPL [rsSync] initial sync drop all databases
2016-11-16T23:47:23.219+0100 I STORAGE [rsSync] dropAllDatabasesExceptLocal 5
2016-11-16T23:53:09.014+0100 I REPL [rsSync] initial sync clone all databases
Looking at the data folder, all the files were erased and new ones started to grow. But after about 8 hours it had resynced barely 5% of the database.
What approach should be used for such large syncs?
We considered increasing the oplog size, but that would require downtime for the entire replica set. What approaches can we use without downtime?
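To make the oplog-size idea concrete, here is a rough sizing sketch in Python. It rests on several assumptions: the write rate is roughly constant, so the time window scales linearly with oplog size; the oplog spanned about 29.9 hours when the member went stale (oldest available entry to the time of the error) while the sync needed about 37.5 hours (our last optime to the same moment); the current size of 50 GB is a made-up placeholder (the real value is reported by `rs.printReplicationInfo()` in the mongo shell); and the 1.5 safety factor is an arbitrary choice.

```python
def required_oplog_gb(current_gb, current_window_h, needed_window_h, safety=1.5):
    """Estimate the oplog size (GB) needed to cover needed_window_h hours.

    Assumes a steady write rate, so the covered window grows linearly
    with the oplog size. The safety factor adds headroom on top.
    """
    if current_window_h <= 0:
        raise ValueError("current_window_h must be positive")
    return current_gb * (needed_window_h / current_window_h) * safety


# Hypothetical 50 GB oplog; window figures are approximations from the logs.
estimate = required_oplog_gb(50, 29.9, 37.5)
print(f"estimated oplog size: {estimate:.0f} GB")
```

This only gives an order of magnitude; with our insert rate varying during the day, the real window could be shorter than the linear estimate suggests.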