
Yesterday I found out that a secondary member of my replica set was out of sync with the primary. The gap was too large for the oplog to cover, so I had to resync manually. Following the manual, I stopped mongod, deleted the entire contents of the dbpath, and started mongod again. It came up in state "STARTUP2" and began the initial sync. There were more than 200G to sync, so I left it running and went home. Today I saw that the state had changed from "STARTUP2" to "SECONDARY":

 "_id" : 0,
"name" : "lab7-mongo-4:27020",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 71414,
"optime" : Timestamp(1448011826, 2),
"optimeDate" : ISODate("2015-11-20T09:30:26.000Z"),
"lastHeartbeat" : ISODate("2015-11-20T09:38:41.000Z"),
"lastHeartbeatRecv" : ISODate("2015-11-20T09:38:41.000Z"),
"pingMs" : 0,
"syncingTo" : "lab7-mongo-9:27020"

However, the dbpath on the secondary is 109G versus 205G on the primary. As I understand it, the secondary should still be syncing, but it is not: the size of its dbpath has not increased in the last two hours. Please advise how to finish the sync.

user1700494

1 Answer


Actually, this is expected behavior.

When data gets written, documents might be moved to new datafiles. While an individual document is never fragmented, the documents can become scattered across the datafiles, especially when the data is highly volatile.

When a document is deleted, its space is available for new documents, but only if those documents fit into that location. If not, new datafiles may be allocated. So in theory it might well happen that only a fraction of the space in the datafiles is used, but still new datafiles get allocated because the new documents do not fit into a "slot".

During a sync, however, documents are written contiguously into newly allocated datafiles, potentially (and most likely) reducing the disk space MongoDB requires for its datafiles.
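To make the effect described above concrete, here is a toy simulation. It is only a sketch under loud assumptions: the slot model, the first-fit reuse rule, and all names (`insertDoc`, `deleteDoc`, `resync`) and sizes are hypothetical and are not MongoDB's actual allocator. It only shows how freed space that new documents do not fit into keeps the allocated size high, and how rewriting only the live documents shrinks it.

```javascript
// Toy model only: a "datafile" is a list of slots, and a freed slot can be
// reused only by a document that fits into it. NOT MongoDB's real allocator.
function insertDoc(slots, size) {
  // Reuse the first free slot that is large enough (first fit).
  const slot = slots.find(s => s.free && s.capacity >= size);
  if (slot) {
    slot.free = false;
    slot.used = size;
  } else {
    // No suitable slot: new space has to be allocated.
    slots.push({ capacity: size, used: size, free: false });
  }
}

function deleteDoc(slots, index) {
  // The space stays allocated but becomes reusable.
  slots[index].free = true;
  slots[index].used = 0;
}

function allocatedBytes(slots) {
  return slots.reduce((sum, s) => sum + s.capacity, 0);
}

function liveBytes(slots) {
  return slots.reduce((sum, s) => sum + s.used, 0);
}

// An initial sync copies only the live documents into fresh storage.
function resync(slots) {
  const fresh = [];
  for (const s of slots) {
    if (!s.free) insertDoc(fresh, s.used);
  }
  return fresh;
}

// Hypothetical workload: small documents are inserted and deleted,
// then larger documents arrive that do not fit into the freed slots.
const primary = [];
for (let i = 0; i < 100; i++) insertDoc(primary, 100);
for (let i = 0; i < 100; i++) deleteDoc(primary, i);
for (let i = 0; i < 50; i++) insertDoc(primary, 200);

const secondary = resync(primary);

console.log(allocatedBytes(primary), liveBytes(primary));     // 20000 10000
console.log(allocatedBytes(secondary), liveBytes(secondary)); // 10000 10000
```

In this toy run both members hold the same live data, yet the freshly synced copy needs half the allocated space, which is the same pattern as 109G versus 205G.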

The change from "STARTUP2" to "SECONDARY" translates to:

I have copied over all documents present when the sync started and applied all oplog entries from the beginning of the sync to its finish. I am guaranteeing consistency.

So, you have basically defragmented your datafiles, which means fewer of them had to be allocated.
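If you want some reassurance that the smaller dbpath still holds the same data, one option is to compare document counts rather than file sizes. The following is a minimal sketch, assuming you run `db.stats()` for the same database on each member and pass the two result objects in. The helper name `compareMembers` and the sample numbers are hypothetical; only the `objects` and `storageSize` fields of the `db.stats()` output are read.

```javascript
// primaryStats / secondaryStats: objects shaped like the output of db.stats(),
// taken for the same database on each member (assumption).
function compareMembers(primaryStats, secondaryStats) {
  return {
    // The document count should match once the secondary has caught up.
    sameDocumentCount: primaryStats.objects === secondaryStats.objects,
    // The storage size may legitimately be smaller after a fresh sync.
    storageSizeRatio: secondaryStats.storageSize / primaryStats.storageSize,
  };
}

// Hypothetical sample values, for illustration only.
const result = compareMembers(
  { objects: 1000, storageSize: 200 },
  { objects: 1000, storageSize: 100 }
);
console.log(result); // { sameDocumentCount: true, storageSizeRatio: 0.5 }
```

On a replica set that is taking writes, the counts can differ briefly because of replication lag, so a small mismatch is not necessarily a problem.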

Markus W Mahlberg
  • Does the primary ever perform defragmentation of its datafiles? – hmjd Nov 20 '15 at 11:21
  • @hmjd Nope. The thing is that defragmentation would stall normal operations a great deal. And usually, the "problem" balances over time, so that even highly volatile data will cease to create new datafiles because a "sweet spot" was found (assuming the data size stays roughly the same). – Markus W Mahlberg Nov 20 '15 at 11:23
  • @hmjd However, you can have your primary step down, making it a secondary, delete the contents of its `--dbpath` directory and have it resync. Note that this usually isn't necessary, though: it slows down the replica set a bit and reduces redundancy just for the sake of reducing the utilization of rather cheap disk space. – Markus W Mahlberg Nov 20 '15 at 11:26
  • Thanks. I was curious if it performed it during a period of inactivity. +1 btw – hmjd Nov 20 '15 at 11:27
  • Darn, we are on SO. I am so sorry. The question, along with its answer, belongs on http://dba.stackexchange.com – Markus W Mahlberg Nov 20 '15 at 11:30