
I've built a Docker Swarm cluster (6 nodes). Although I was able to build resilient services (several nodes answering requests for the same service with several instances), I cannot find a way to build a highly available IP-level service (using a single public IP address) that survives a manager node failure.

Is the Docker Swarm reference architecture designed to always sit behind an external load balancer (IP or DNS) or a reverse proxy? Or behind an old-school, software-based virtual IP (Pacemaker)?

I have the feeling that my cluster is not self-resilient (in the way that my vSphere cluster provides high availability for VMs) and that I always end up with a single point of failure somewhere.

Is there a way to bridge the public networks of the Docker Swarm hosts?

1 Answer

  1. You should have a minimum of 3 managers for them to be HA.
  2. You didn't mention your hosting environment, but sharing a single IP among many hosts is an OS-level feature that isn't supported by most cloud hosters.
  3. If you're just publishing web apps to users external to your Swarm, you can rely on multiple A records (one per public IP of each Swarm host you want to receive traffic) for DNS round-robin failover. All modern browsers support this as an HA method: the client pulls all the A records and automatically retries the next one in the list if an address becomes unresponsive.
  4. Most cloud hosters have a load-balancer option that provides a single IP and routes to your Swarm nodes. This seems to be what most people do, and it is what the Docker for AWS/Azure templates use (from https://store.docker.com).
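
Points 1 and 3 can be illustrated with a short Python sketch. It rests on some assumptions: Swarm managers keep their state with Raft, so the cluster needs a strict majority of managers alive, and a non-browser client can imitate the browser behaviour by resolving every A record and trying each address in turn. The function names (`manager_fault_tolerance`, `resolve_a_records`, `connect_with_failover`) are hypothetical helpers written for this example, not part of Docker or of any library.

```python
import socket
from typing import Any, Callable, Iterable, List, Tuple


def manager_fault_tolerance(managers: int) -> int:
    """Manager failures a Raft-based swarm can lose while still keeping quorum."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    quorum = managers // 2 + 1  # strict majority of managers
    return managers - quorum


def resolve_a_records(hostname: str, port: int) -> List[str]:
    """Return every distinct IPv4 address published for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: List[str] = []
    for *_, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect_with_failover(
    addresses: Iterable[str],
    port: int,
    timeout: float = 2.0,
    connect: Callable[..., Any] = socket.create_connection,
) -> Tuple[str, Any]:
    """Try each address in order; return (ip, connection) for the first success.

    `connect` is injectable so the retry logic can be exercised without a network.
    """
    last_error = None
    for ip in addresses:
        try:
            return ip, connect((ip, port), timeout=timeout)
        except OSError as exc:  # refused, timed out, unreachable, ...
            last_error = exc
    raise ConnectionError("no published address accepted a connection") from last_error
```

With this arithmetic, 3 managers tolerate one failure and 5 tolerate two, while a fourth manager adds no tolerance over three. A caller would typically chain the other two helpers, for example `connect_with_failover(resolve_a_records(host, 443), 443)`, where `host` is whatever name carries the A records of the Swarm nodes.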

I talk about a lot of these topics in my "Taking Docker to Production" session from DockerCon EU 2017 (sorry, the website requires an email address to watch).

Bret Fisher
  • Thank you. As I said in my post, I was almost sure Docker itself didn't provide OS-level fault tolerance. You asked about my hosting environment: I've installed two ESXi hosts managed by a vCenter. For very specific reasons, I don't have shared storage between nodes, so I can't enable vSphere HA. I thought that a Docker orchestration solution could provide, via GlusterFS and high availability, an elegant solution for very basic services (old-school DNS server, DHCP, LDAP, etc.) for a legacy application. – Fabio Barcelos Feb 02 '18 at 09:49
  • Yeah, for shared storage with Docker you would look at Volume Plugins, which are listed on https://store.docker.com – Bret Fisher Feb 02 '18 at 19:34