
I have a request for periodic data archival of a large collection. A few points of the requirement are causing me some concern:

  1. Archived data should be moved into a different database, so the application can access it the same way it accesses production.
  2. The archival frequency may be monthly, yearly, or even less frequent.
  3. Archival must happen without interrupting the production system's availability or degrading its performance.

This requirement implies a large volume of inserts and deletes within a short time frame while the archival is running, which raises a few challenges:

  1. For any given record, the deletion can only happen, and must happen, after the insertion has succeeded.
  2. A large volume of inserts and deletes may blow through the oplog before replication to the secondary nodes in the replica set completes; after all, the oplog size is generally tuned for routine daily operation (see the sketch below for estimating the current oplog window).
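
To put a number on challenge 2, here is a minimal sketch (Python with PyMongo; the host name is hypothetical, and it has to point at a replica set member of each shard, not at mongos) that estimates how much time the oplog currently covers:

```python
from pymongo import MongoClient

# Connect directly to one replica set member (hypothetical host/port).
client = MongoClient("mongodb://shard0-node1:27017")
oplog = client.local["oplog.rs"]

# The oplog is a capped collection kept in insertion order: the first and
# last documents in natural order carry the oldest and newest op timestamps.
first = oplog.find().sort("$natural", 1).limit(1).next()
last = oplog.find().sort("$natural", -1).limit(1).next()

window_seconds = last["ts"].time - first["ts"].time
print("Oplog window: %.1f hours" % (window_seconds / 3600.0))
```

If the planned archival burst writes more than this window can absorb at the replication rate, the secondaries risk falling off the oplog.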

For challenge 1, Mitch Pronschinske has suggested a solution here that comes quite close. Mitch's archive function ensures that deletion can only happen after a successful insertion, but it does not cover the "have to happen" part. Nevertheless, it is very close, and the "have to happen" part could be addressed on top of his script.
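
For illustration, this is roughly the ordering I have in mind, as a minimal sketch rather than Mitch's actual function (Python with PyMongo; the database/collection names and the cutoff filter are hypothetical). Each batch is deleted from production only after the archive insert is acknowledged, and the "have to happen" part falls out of simply re-running until nothing matches:

```python
from datetime import datetime
from pymongo import MongoClient
from pymongo.errors import BulkWriteError

client = MongoClient("mongodb://mongos-host:27017")  # hypothetical mongos
prod = client["production"]["events"]                # hypothetical source
archive = client["archive"]["events"]                # hypothetical target

cutoff = datetime(2016, 1, 1)  # hypothetical archival cutoff
BATCH = 1000

while True:
    docs = list(prod.find({"created": {"$lt": cutoff}}).limit(BATCH))
    if not docs:
        break
    try:
        # ordered=False lets the batch continue past duplicate-key errors,
        # i.e. documents already archived by an earlier, interrupted run.
        archive.insert_many(docs, ordered=False)
    except BulkWriteError as exc:
        # Re-raise unless every error is a duplicate key (code 11000).
        if any(e["code"] != 11000 for e in exc.details["writeErrors"]):
            raise
    # Deletion happens only after the insert was acknowledged; if the
    # process dies between the two steps, the next run re-inserts
    # harmlessly thanks to the duplicate handling above, so the record
    # is eventually deleted ("have to happen").
    prod.delete_many({"_id": {"$in": [d["_id"] for d in docs]}})
```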

It's challenge 2 that gives me a headache. According to MongoDB's documentation, changing the oplog size requires downtime and manual intervention (in 3.2, each member has to be restarted in standalone mode to recreate its oplog). Given requirement 3, that is unlikely to be an option.
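
Since resizing the oplog is out, the direction I'm exploring is to throttle the archival so replication always keeps up with the burst, instead of enlarging the oplog to absorb it. A sketch of the idea (Python with PyMongo; host name and threshold are hypothetical, and it must run against a member of each shard's replica set):

```python
import time
from pymongo import MongoClient

client = MongoClient("mongodb://shard0-node1:27017")  # hypothetical member

MAX_LAG_SECONDS = 30  # hypothetical threshold

def replication_lag_seconds():
    """Worst-case secondary lag, from replSetGetStatus optime dates."""
    status = client.admin.command("replSetGetStatus")
    optimes = [m["optimeDate"] for m in status["members"] if "optimeDate" in m]
    return (max(optimes) - min(optimes)).total_seconds()

def wait_for_secondaries():
    # Called between archive batches: block until all secondaries have
    # caught up to within the threshold, so the oplog is never consumed
    # faster than it replicates.
    while replication_lag_seconds() > MAX_LAG_SECONDS:
        time.sleep(5)
```

Alternatively, performing each batch's inserts and deletes with write concern "majority" would make every batch wait for replication implicitly, trading archival speed for safety.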

Would anyone have any experience or advice on how this can be achieved? Thanks!

My environment information:

  • MongoDB 3.2
  • 3 shards
  • Replica set: 3 data members; 1 member is geographically redundant with priority 0.
  • OS: MS Windows Server 2012 R2