
I have a MongoDB production cluster running 2.6.11 with 20 replica sets. I am running out of disk space because the majority of the chunks are stored on one replica set. When I check the log, I can see that the chunk move failed because of "deletes from previous migration":

2015-12-28T17:13:32.164+0000 [conn6504] about to log metadata event: { _id: "db1-2015-12-28T17:13:32-56816dbc6b0464b0a5801db8", server: "db1", clientAddr: "xx.xx.xx.11:50077", time: new Date(1451322812164), what: "moveChunk.start", ns: "emailing_nQafExtB.reports", details: { min: { email: "xxxxxxx" }, max: { email: "xxxxxxx" }, from: "shard16", to: "shard22" } }
2015-12-28T17:13:32.675+0000 [conn6504] about to log metadata event: { _id: "db1-2015-12-28T17:13:32-56816dbc6b0464b0a5801db9", server: "db1", clientAddr: "xx.xx.xx.11:50077", time: new Date(1451322812675), what: "moveChunk.from", ns: "emailing_nQafExtB.reports", details: { min: { email: "xxxxxxx" }, max: { email: "xxxxxxx" }, step 1 of 6: 3, step 2 of 6: 314, note: "aborted", errmsg: "moveChunk failed to engage TO-shard in the data transfer: can't accept new chunks because there are still 1 deletes from previous migration" } }

I followed the answer from this question, but it did not work for me. I ran the stepDown command on one primary, and then on every primary in the cluster. I did the same with the cleanUpOrphaned command.
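For reference, the orphan cleanup described above can be sketched roughly as follows. This is a minimal sketch, not the exact commands that were run: the helper name `cleanupAllOrphans` is hypothetical, `adminDb` is assumed to behave like `db` in the mongo shell when connected directly to a shard primary (not through mongos), and the command is spelled `cleanupOrphaned` as in the MongoDB documentation.

```javascript
// Sketch: run cleanupOrphaned repeatedly until the whole shard key space
// of one namespace has been covered.
// Assumption: adminDb.adminCommand() behaves like db.adminCommand() in the
// mongo shell on a shard primary.
function cleanupAllOrphans(adminDb, ns) {
  var nextKey = {};   // an empty document means "start from the lowest key"
  var passes = 0;
  while (nextKey != null) {
    var result = adminDb.adminCommand({
      cleanupOrphaned: ns,
      startingFromKey: nextKey
    });
    if (result.ok != 1) {
      // Stop on failure or timeout instead of looping forever.
      throw new Error("cleanupOrphaned failed: " + JSON.stringify(result));
    }
    passes += 1;
    // stoppedAtKey is absent once no orphan ranges remain, which ends the loop.
    nextKey = result.stoppedAtKey;
  }
  return passes;
}
```

In the shell this would be called on each shard primary as `cleanupAllOrphans(db, "emailing_nQafExtB.reports")`, after `rs.stepDown()` has completed and a new primary has been elected.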

Has anybody run into this problem?

Thanks in advance for any insights.

  • Do you use noTimeout cursors? If yes and these cursors are not closed they can block deletion of data post migration. – James Wahlin Dec 28 '15 at 17:36
  • That was the original problem. We saw it and fixed it (we added the timeout option in our code). After this fix, I did a stepDown on all the primaries. The number of pending deletes decreased, but it looks like some are still locked. – Christophe Biguereau Dec 29 '15 at 09:19
  • I double-checked the log file on each primary; the cleanUpOrphaned command deleted no documents: [conn206432] rangeDeleter deleted 0 documents for emailing_nQafExtB.reports from { email: MinKey } -> .... – Christophe Biguereau Dec 29 '15 at 10:50
  • Are the chunk migrations occurring now? If not you can run db.currentOp() on each of the primaries to look for blocking operations. – James Wahlin Dec 29 '15 at 14:16
  • No, the chunk migration is still blocked, but when I check the current operations there is no activity on this db. – Christophe Biguereau Dec 29 '15 at 17:19
  • @ChristopheBiguereau Did you find a solution to this? – ankshah Aug 17 '17 at 05:07
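
The checks suggested in the comments above could be sketched like this. It is only an illustration: the helper name `findOpsOnNamespace` and the 60-second threshold are hypothetical, and the `inprog`, `ns`, and `secs_running` fields are assumed to have the shape returned by `db.currentOp()` in the mongo shell.

```javascript
// Sketch: filter the output of db.currentOp() down to operations on one
// namespace that have been running for at least minSecs seconds.
// Assumption: currentOpResult looks like { inprog: [ { ns, secs_running, ... } ] }.
function findOpsOnNamespace(currentOpResult, ns, minSecs) {
  return currentOpResult.inprog.filter(function (op) {
    // Idle entries may have no secs_running field, so treat it as 0.
    return op.ns === ns && (op.secs_running || 0) >= minSecs;
  });
}
```

On each shard primary this could be used as `findOpsOnNamespace(db.currentOp(true), "emailing_nQafExtB.reports", 60)`, where passing `true` also lists idle connections. An open noTimeout cursor does not necessarily show up as an active operation, so it may also be worth looking at `db.serverStatus().metrics.cursor.open.noTimeout`, assuming that counter is present on the server version in use.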

0 Answers