6

I have a two member replica set (within a sharded cluster), with no arbiter. I know that the recommended design is to have three servers (or an odd number) but right now I have to stick with this design.

The problem is that when the primary is down, the secondary doesn't get elected as a temporary master (since the priority for both servers is 1 and I have no arbiter).

Is there any way to get that automatic failover (slave becomes primary -> primary back online -> default primary gets its previous state)?

Thank you!

MariusP

2 Answers

7

You MUST have at least 3 members in the replica set; having 2 members is worse than having just a single node (NEVER EVER DO THIS!!). Your problem is worse than you think: if your secondary goes down, your primary will demote itself to a secondary, so the set is effectively down because there is no primary left. I don't know why you "have to stick with this configuration", but it cannot be a solution.

The solution is to add an arbiter node. This is simply a lightweight member of the replica set that holds no actual data and is only there to take part in the election of a new primary if one of the other members goes down.
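As a rough sketch of what such a set looks like, the Python snippet below builds a replica set configuration document with two data-bearing members and one arbiter. The `arbiterOnly` field is the standard way to mark a member as an arbiter; the set name `rs0`, the host names, and the helper `build_replica_set_config` are assumptions made up for illustration. The document is only constructed here, not sent to a server.

```python
def build_replica_set_config(set_name, data_hosts, arbiter_host):
    """Build a replica set config document: data members plus one arbiter."""
    # Data-bearing members get consecutive _id values starting at 0.
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    # The arbiter votes in elections but stores no data.
    members.append(
        {"_id": len(data_hosts), "host": arbiter_host, "arbiterOnly": True}
    )
    return {"_id": set_name, "members": members}


# Hypothetical host names; replace with your own machines.
config = build_replica_set_config(
    "rs0",
    ["db1.example.net:27017", "db2.example.net:27017"],
    "arbiter.example.net:27017",
)
```

Ideally the arbiter runs on a third machine, since placing it on the same host as one of the data members ties its availability to that host.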

Abdullah Khawer
Clinton Bosch
  • 1
    How is having 2 members worse than one member? If one server fails completely and loses your data, surely your data would still exist on the backup server. – Sammaye Oct 07 '13 at 08:23
  • 5
    Purely from an uptime perspective, by having a 2-member replica set you have doubled the chances of your db going down, because if either server (primary or secondary) goes down then your database is effectively down, since there is no way of getting a primary. You are correct though: if one of the servers actually dies and the data is not recoverable, then having the second server will save you. But from an availability perspective, 2 is worse than 1. – Clinton Bosch Oct 07 '13 at 08:34
  • The problem is that I only have two physical machines (that's the only limitation I cannot change). Of course I can use an arbiter on one of those two machines. But then, if the arbiter is on the primary machine and that whole machine goes down, does the single slave become a master until the primary and the arbiter are restored? – MariusP Oct 07 '13 at 09:00
  • No, the secondary will not get promoted to primary in this situation – Clinton Bosch Oct 07 '13 at 10:50
  • 1
    You need a majority vote, so an even number of machines doesn't improve availability. You might want to add a third, very cheap virtual machine if at all possible, because the arbiter can easily run on a tiny Linux box with, say, 256MB of RAM, probably a lot less. Alternatively, virtualize the deployment, but that doesn't protect you from h/w failures. – mnemosyn Oct 07 '13 at 10:52
  • Now I have a three-member replica set: one arbiter, one primary and one secondary. If the primary and the arbiter are both down, the only remaining secondary does not vote for itself and/or become primary (since it's the only server left in that replica set). Is this a limitation of MongoDB clustering? – MariusP Oct 08 '13 at 12:51
  • You will probably find the guys from mongo choking at the word limitation; if you get into split-brain theory, they would argue that it is by design. In short, it might be that the primary is actually up but the connection between primary and secondary has just gone down. If the secondary then promoted itself to primary (because it thought all the other servers were down), the application might see 2 primaries and chaos would prevail. – Clinton Bosch Oct 09 '13 at 13:53
  • Priority 0 members cannot trigger an election if they go down. Such members can serve as simple data replication members. – Rondo Mar 01 '15 at 02:52
0

Unfortunately, to get an elected primary, a majority of the set must be online.

There is a specific term for the type of election process that is used, but MongoDB's documentation, from what I can see, has been dumbed down to remove this information, which isn't helpful at all; in fact, I believe it is a quorum: http://en.wikipedia.org/wiki/Quorum

50% of the members is not enough; more than half must be reachable.
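The majority rule can be illustrated with a small Python sketch. This is only the arithmetic of a strict majority, not MongoDB's actual election code, and the function names `has_majority` and `fault_tolerance` are made up for this example.

```python
def has_majority(total_voting_members, reachable_members):
    """True if the reachable members form a strict majority of the set."""
    return reachable_members > total_voting_members // 2


def fault_tolerance(total_voting_members):
    """How many members can be lost while a majority still remains."""
    majority = total_voting_members // 2 + 1
    return total_voting_members - majority


# Two members, one down: 1 of 2 is only 50%, so no primary can be elected.
two_member_ok = has_majority(2, 1)
# Three members (e.g. primary, secondary, arbiter), one down: 2 of 3 is a majority.
three_member_ok = has_majority(3, 2)
# Three members, two down: the lone survivor cannot elect itself.
lone_survivor_ok = has_majority(3, 1)
```

Note that `fault_tolerance(2)` is 0 while `fault_tolerance(3)` and `fault_tolerance(4)` are both 1, which is why an even number of voting members adds no availability over the next lower odd number.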

Sammaye