
Setup: a 5-node replica set running MongoDB 3.4.5.

I'm trying to switch the PRIMARY with rs.stepDown(60, 30), but I consistently get this error:

rs0:PRIMARY> rs.stepDown(60, 30)
{
    "ok" : 0,
    "errmsg" : "No electable secondaries caught up as of 2017-07-11T00:21:11.205+0000. Please use {force: true} to force node to step down.",
    "code" : 50,
    "codeName" : "ExceededTimeLimit"
}

However, rs.printSlaveReplicationInfo(), run in a parallel terminal, confirms that all secondaries are fully caught up:

rs0:PRIMARY> rs.printSlaveReplicationInfo()
source: X.X.X.X:27017
    syncedTo: Tue Jul 11 2017 00:21:11 GMT+0000 (UTC)
    0 secs (0 hrs) behind the primary
source: X.X.X.X:27017
    syncedTo: Tue Jul 11 2017 00:21:11 GMT+0000 (UTC)
    0 secs (0 hrs) behind the primary
source: X.X.X.X:27017
    syncedTo: Tue Jul 11 2017 00:21:11 GMT+0000 (UTC)
    0 secs (0 hrs) behind the primary
source: X.X.X.X:27017
    syncedTo: Tue Jul 11 2017 00:21:11 GMT+0000 (UTC)
    0 secs (0 hrs) behind the primary
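
For a second opinion, the member optimes can also be compared directly via rs.status() (a quick sketch; output omitted):

rs0:PRIMARY> rs.status().members.forEach(function (m) {
    // print state and last applied optime for every member, primary included
    print(m.name + "  " + m.stateStr + "  " + m.optimeDate)
})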

Am I doing something wrong?

UPD: I've checked long-running operations before and during rs.stepDown, as suggested in the answers below, and it looks like this:

# Before rs.stepDown
$ watch "mongo --quiet --eval 'JSON.stringify(db.currentOp())' | jq -r '.inprog[] | \"\(.secs_running) \(.desc) \(.op)\"' | sort -rnk1"
984287 rsSync none
984287 ReplBatcher none
67 WT RecordStoreThread: local.oplog.rs none
null SyncSourceFeedback none
null NoopWriter none
0 conn615153 command
0 conn614948 update
0 conn614748 getmore
...

# During rs.stepDown
984329 rsSync none
984329 ReplBatcher none
108 WT RecordStoreThread: local.oplog.rs none
16 conn615138 command
16 conn615136 command
16 conn615085 update
16 conn615079 insert
...

Basically, the long-running user operations appear to be a side effect of rs.stepDown() itself: secs_running becomes nonzero once the PRIMARY attempts to switch over and keeps growing until stepDown fails, at which point everything goes back to normal.
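
(For anyone reproducing this without jq, roughly the same view is available from the shell itself:)

rs0:PRIMARY> db.currentOp().inprog.forEach(function (op) {
    // mirror the watch output above: seconds running, thread/connection name, op type
    print(op.secs_running + " " + op.desc + " " + op.op)
})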

Any ideas on why this happens and whether that's normal at all?

Dmitry
    This question isn't [on-topic for StackOverflow](https://stackoverflow.com/help/on-topic) and belongs on DBA StackExchange. Your related question there actually has all of the required context about your environment: [MongoDB hangs up on shutdown](https://dba.stackexchange.com/questions/179616/mongodb-hangs-up-on-shutdown/180379). – Stennie Jul 21 '17 at 01:57
  • **This issue has been fixed in version 3.4.6.** See [this question](https://dba.stackexchange.com/questions/179616/mongodb-hangs-up-on-shutdown/180379) for more context. – Dmitry Jul 21 '17 at 01:06

4 Answers


I have used the command below to step the primary down to a secondary:

db.adminCommand( { replSetStepDown: 120, secondaryCatchUpPeriodSecs: 15, force: true } )

You can find this in the official MongoDB documentation:

https://docs.mongodb.com/manual/reference/command/replSetStepDown/
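
Note that force: true lets the primary step down even when no electable secondary is caught up, which can cause writes that haven't replicated yet to be rolled back, so use it with care. Afterwards, a quick sketch to confirm a new primary was elected:

rs.status().members.forEach(function (m) {
    // report which member currently holds the PRIMARY state
    if (m.stateStr === "PRIMARY") print("current primary: " + m.name)
})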

Amol Suryawanshi

To close the loop on this question, it was determined that the failed stepdown was due to time going backward on the host.

MongoDB 3.4.6 is more resilient to time issues on the host, and upgrading the deployment fixes the stalling issues.
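
As a rough sanity check for this kind of clock problem (my own sketch, assuming the shell can reach every member without extra auth), each member's idea of "now" can be compared via the localTime field of isMaster:

rs.status().members.forEach(function (m) {
    // connect to each member and print its local wall-clock time;
    // a skew of more than a couple of seconds is worth chasing with NTP
    var t = new Mongo(m.name).getDB("admin").isMaster().localTime
    print(m.name + " -> " + t)
})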

kevinadi

Before stepping down, rs.stepDown() will attempt to terminate long-running user operations that would block the primary from stepping down, such as an index build, a write operation, or a map-reduce job.

Do you have some long-running jobs going on? Check the result of db.currentOp().
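
For example, something like this lists only the operations that have been running for a while (the 10-second cutoff is arbitrary):

db.currentOp().inprog.filter(function (op) {
    return op.secs_running && op.secs_running > 10
}).forEach(function (op) {
    // opid can be fed to db.killOp() if the operation turns out to be the blocker
    printjson({ opid: op.opid, op: op.op, ns: op.ns, secs_running: op.secs_running })
})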

You can also try allowing a longer catch-up window, e.g. rs.stepDown(360, 60); note that the step-down period must be longer than the catch-up period.

JJussi
  • Thanks for the fast response! I've checked long-running operations during `rs.stepDown` and updated the question. However, based on what I see, it does not seem like increasing the step-down time will help in this case... – Dmitry Jul 11 '17 at 22:32
  • How many nodes does your replica set have in total? If you watch (tail -f) mongod.log during the step-down, what does it say about why stepping down failed? – JJussi Jul 12 '17 at 04:20
  • There are 5 nodes in total. The log is not very informative in this case: it just says "Attempting to step down in response to replSetStepDown command", then a series of "end connection" / "connection accepted" messages, and when the timeout expires MongoDB just goes back to processing queries. I guess increasing the logging level might add some more color here. But since the primary does not switch over normally, I'm not sure forcing it to do so is safe given this is a production system. What would you suggest? – Dmitry Jul 12 '17 at 05:30
  • Check NTP on all nodes. If the clocks are NOT in sync, you cannot step down. You can always try to stop all writes/changes to the DB with `db.fsyncLock()`, but then you need to shut that node down quite quickly, because you are in production and a new primary must be elected. – JJussi Jul 12 '17 at 07:12
  • `timedatectl` says `Network time on: yes`, `NTP synchronized: yes`. Both local and universal time look consistent between the nodes. However, RTC time is inconsistent and differs a lot from node to node: by several minutes, days, and even months. Can this be the problem? – Dmitry Jul 12 '17 at 19:13

Quoting an answer from https://jira.mongodb.org/browse/SERVER-27015:

This is most likely due to the fact that by default the shutdown command will only succeed on a primary if the secondaries are fully caught up at the exact moment that the shutdown command is executed.

I faced a similar issue and tried the db.shutdownServer() command several times; however, it only worked at the exact moment when the secondary was 0 seconds behind the primary.
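
If retrying by hand is impractical, db.shutdownServer() also accepts a timeoutSecs option that gives the secondaries a window to catch up instead of requiring them to be caught up at the exact moment of the call (a sketch; pick a timeout that fits your replication lag):

// wait up to 60 seconds for a secondary to catch up before shutting down
db.shutdownServer({ timeoutSecs: 60 })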

Yahya