Only a partial solution here.
So, when attaching your services to a custom overlay network, you indeed have access to Docker's custom Service discovery feature. I'll detail the networking feature of Docker Swarm mode, before trying to tie it to your problem.
I'll be using the different term of services and tasks, in which a service could be elasticsearch, whereas a task is a single instance of that elasticsearch service.
Docker networking
The idea is that for each services you create, docker assigns a Virtual IP (VIP), and a custom dns alias. You can retrieve this VIP using the docker service inspect myservice
command.
But, there is two modes to attach a service to an overlay network dnsrr and VIP. You can select these options using the --endpoint-mode
options of docker service create
.
The VIP mode (I believe it is the default one, or at least the most used), affects the virtual ip to the service's dns alias. This means that doing an nslookup servicename
would return to you a single vip, that behind the scenes, would be linked to one of your container in a round robin fashion. But, there is also a special dns alias that lets you access all of your instances ips (all of your tasks ips) : tasks.myservice
.
So in VIP mode you can retrieve all of your tasks ips using a simple nslookup tasks.myservice
, where myservice is a service name.
The other mode is dnsrr. This mode simply gets rid of the VIP, and connects the dns alias to the different tasks (=service instances), in a round robin way. This way, you simply have to do a nslookup myservice
to retrieve the different service instances ip.
Elasticsearch clustering
Ok so first of all I'm not really familiar with the way elasticsearch lets you cluster. From what I understood from your question, you need when running the elasticsearch binary, give it as a parameter, the adress of all of the other nodes it needs to cluster with.
So what I would do, is to create a custom Elasticsearch image, probably based on the one from the default library, to which I would add a custom Entrypoint
that would firstly run a script to retrieve the other tasks ip.
I'd believe that staying in VIP mode is suitable for you, since there is the tasks.myservice
dns alias. You'll then need to parse the output to retrieve the tasks ip (and probably remove yours). Then you'll be able to save them in a config file environment variable, or use them as a runtime option for your elasticsearch binary.
Edit: To create a custom overlay network, you will need to use the docker network create
command, and use the --network
option of docker service create
This is answer is mainly based on the Swarm mode networking documentation