I'm working with MongoDB distributed across 3 data centers.

For this example, the data centers are named A, B, and C.

When everything is going well, all user traffic is pointed to A,

so the MongoDB primary is in A. The setup is:

  • 3 servers in A (with high priority)
  • 1 server in B (with low priority)
  • 1 server in C (priority 0)
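For reference, the layout above could be written as a replica set configuration document. This is only a sketch: the set name `rs0`, the hostnames, and the concrete priority values (10, 1, 0) are assumptions made for illustration, since the question only says "high", "low", and "0".

```python
# Sketch of the 3-1-1 layout as a replica set config document.
# The set name, hostnames and exact priority numbers are hypothetical.

def member(member_id, host, priority):
    """Build one entry of the 'members' array."""
    return {"_id": member_id, "host": host, "priority": priority}

config = {
    "_id": "rs0",  # hypothetical replica set name
    "members": [
        member(0, "a1.example.net:27017", 10),  # DC A, high priority
        member(1, "a2.example.net:27017", 10),  # DC A, high priority
        member(2, "a3.example.net:27017", 10),  # DC A, high priority
        member(3, "b1.example.net:27017", 1),   # DC B, low priority
        member(4, "c1.example.net:27017", 0),   # DC C, never becomes primary
    ],
}

print(len(config["members"]))
```

A document of this shape is what the `replSetInitiate` admin command expects. Note that a priority 0 member can still vote in elections; it just cannot become primary.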

The problem is supporting MongoDB writes when either of 2 scenarios happens:

  1. No network between A, B, and C (the network tunnel is down).
  2. Data center A is on fire :), i.e. the data center isn't working. At this point all user traffic is pointed to B, and a primary election in B is expected.

Scenario 1 isn't a problem: when the tunnel between the data centers is down, A still has a majority of the replicas and high priority, so everything keeps working.

Scenario 2 won't work: when A stops working, all 3 replicas in A are unreachable, so no new primary will be elected in B or C because the majority of replicas is down.
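The arithmetic behind both scenarios can be checked with a few lines of Python. This is a simplified model, assuming every member has exactly one vote and a primary needs a strict majority of all configured votes; the helper name `has_majority` is made up for this sketch.

```python
# Simplified election model for the 3-1-1 layout: one vote per member,
# a primary needs a strict majority of all configured votes.

def has_majority(reachable_votes: int, total_votes: int) -> bool:
    return reachable_votes > total_votes // 2

votes = {"A": 3, "B": 1, "C": 1}
total = sum(votes.values())

# Scenario 1: tunnel down, each DC only sees its own members.
partition = {dc: has_majority(n, total) for dc, n in votes.items()}

# Scenario 2: DC A is gone, B and C can still see each other.
dc_a_down = has_majority(total - votes["A"], total)

print(partition)   # {'A': True, 'B': False, 'C': False}
print(dc_a_down)   # False
```

With 5 votes the threshold is 3, so A alone can keep a primary during a partition, but B and C together only hold 2 votes once A is down.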

How can I set up my replica set so it supports both scenarios?

Nimrod007

1 Answer

This is not possible: with a majority election approach as used by MongoDB, you can't have an 'available' system both under a total network partition and under the failure of a DC. Either the majority is in one DC, in which case the set survives partitions but not that DC going down, or the majority requires 2 DCs to be up, in which case it survives one DC going down but not a full network failure.
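The trade-off can be verified by brute force. The sketch below makes the same simplifying assumptions as the argument above: one vote per member, three DCs, a "partition" means every DC is isolated from the others, and "DC loss" means any single DC may disappear while the remaining two still see each other. The function names are made up for illustration, and the upper bound of 7 reflects MongoDB's limit on voting members.

```python
from itertools import product

MAX_VOTERS = 7  # MongoDB allows at most 7 voting members per replica set

def survives_partition(layout):
    """True if some DC holds a strict majority on its own."""
    total = sum(layout)
    return any(n > total // 2 for n in layout)

def survives_dc_loss(layout):
    """True if the remaining DCs keep a strict majority whichever DC fails."""
    total = sum(layout)
    return all(total - n > total // 2 for n in layout)

def layouts(total, dcs=3):
    """All ways to spread `total` voting members over `dcs` data centers."""
    for combo in product(range(total + 1), repeat=dcs):
        if sum(combo) == total:
            yield combo

both = [
    layout
    for total in range(1, MAX_VOTERS + 1)
    for layout in layouts(total)
    if survives_partition(layout) and survives_dc_loss(layout)
]

print(both)                                                        # []
print(survives_partition((3, 1, 1)), survives_dc_loss((3, 1, 1)))  # True False
print(survives_partition((2, 2, 1)), survives_dc_loss((2, 2, 1)))  # False True
```

No layout passes both checks: if one DC holds a strict majority, losing that DC necessarily leaves less than half of the votes.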

Your options:

  • Accept the partition problem and change the setup to 2-2-1. Unreliable tunnels should be solvable; if the entire network of a DC goes down, you're at scenario 2.
  • Accept the DC problem and stick to your configuration. The most likely problems are probably large-scale network issues and massive power outages, not fire.
  • Use a database that supports other types of fault-tolerance. That, however, is not a panacea since this entails other tradeoffs that must be well understood.

Keeping the system up when DC A goes down also requires application servers in DC B or C, which is a tricky problem in its own right. If you use a more partition-tolerant database, for instance, you could easily end up with a 'split brain' problem where application servers in different DCs accept different, conflicting writes. Such problems can only be solved at the application level.

mnemosyn
  • Is there a way I can quickly add more replicas to B? Have hidden replicas and change the config when A is down? – Nimrod007 May 18 '15 at 05:18
  • Split brain isn't a problem for me, because when A isn't working all user traffic goes to B. – Nimrod007 May 18 '15 at 09:22
  • Sorry, missed the comment. Split brain is always a problem, because you cannot possibly know whether DC A is truly down or simply unreachable from your location. Suppose a major uplink/interconnect has a weird routing problem and all traffic from the west coast ends up in DC A, all traffic from the east coast in DC B. Both are up, but they can't see *each other*, and nobody from either west or east coast can determine that the other one is up... Network partitions, in this way, can be worse than something actually going down. – mnemosyn May 22 '15 at 09:45
  • +1 on the 2-2-1 solution. If you can swing that many nodes, this is the most graceful solution to preventing loss of majority nodes in the event a DC has been vaporized. – DCaugs Jan 14 '16 at 15:00