I have a 3-node Docker Swarm cluster, and we might want to run 2 managers. I know that there is only one leader at any given time. I am looking for literature on the pros and cons of running multiple managers. Specifically: if my 3-node cluster has 2 managers and 1 worker, what is the downside of simply making all 3 nodes managers? Any thoughts would be helpful.

curiousengineer

1 Answer

A Docker swarm with two managers is not recommended.

Why?

Docker Swarm uses the Raft consensus algorithm:

Raft tolerates up to (N-1)/2 failures and requires a majority or quorum of (N/2)+1 members to agree on values proposed to the cluster. This means that in a cluster of 5 Managers running Raft, if 3 nodes are unavailable, the system will not process any more requests to schedule additional tasks

So with 2 managers, if one is down, the other cannot reach a quorum on its own and will not be able to schedule additional tasks (no cluster upgrades, no new services, etc.).
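
The quorum rule quoted above can be checked with a few lines of arithmetic. This is only a minimal sketch of the majority formula, not Docker's implementation; the function names `quorum_size` and `has_quorum` are made up for illustration.

```python
def quorum_size(managers: int) -> int:
    """Smallest majority of a Raft group with `managers` members: (N/2)+1."""
    return managers // 2 + 1


def has_quorum(managers: int, reachable: int) -> bool:
    """True if the reachable managers still form a majority."""
    return reachable >= quorum_size(managers)


# Two managers, one down: 1 reachable < quorum of 2.
print(has_quorum(managers=2, reachable=1))  # False
# Three managers, one down: 2 reachable >= quorum of 2.
print(has_quorum(managers=3, reachable=2))  # True
# Five managers, three down (the case from the quote): 2 reachable < quorum of 3.
print(has_quorum(managers=5, reachable=2))  # False
```

Under this rule, a second manager raises the quorum from 1 to 2 without adding any tolerance for failure.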

The docs are also clear about the number of managers you should have for high availability:

Size your deployment

To make the cluster tolerant to more failures, add additional replica nodes to your cluster.

Manager nodes    Failures tolerated
      1                  0
      3                  1
      5                  2
      7                  3
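
The table follows directly from the (N-1)/2 formula, and extending it to even manager counts shows why the docs list only odd sizes. A small sketch, assuming integer division; `failures_tolerated` is a hypothetical helper name, not a Docker API.

```python
def failures_tolerated(managers: int) -> int:
    """Manager failures a Raft group of this size can survive: (N-1)/2, rounded down."""
    return (managers - 1) // 2


print("Manager nodes    Failures tolerated")
for n in range(1, 8):
    print(f"{n:>7}{failures_tolerated(n):>19}")
```

Each even size tolerates exactly as many failures as the odd size just below it, so going from 1 to 2 managers (or from 3 to 4) adds a node that must stay healthy without buying any extra fault tolerance.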

In brief, as the docs state:

Adding more managers does NOT mean increased scalability or higher performance. In general, the opposite is true.

François Maturel
  • So what happens if the # of managers is odd, say 3 (as recommended), then one of them goes down? Now we have 2 managers which can't elect a leader. Is it still considered to be "Failure tolerant"? – Ivan Kleshnin Aug 24 '22 at 15:50
  • @IvanKleshnin No, it is no longer "failure tolerant": after losing the 3rd node, you can't add or remove nodes until you recover the unavailable manager nodes, and you also can't recover the swarm with disaster recovery commands. The OP should take care of the cluster and make sure the third manager comes back up; all swarm manager commands will then be available again. – François Maturel Aug 25 '22 at 07:12