ES Docker container not joining another docker container having same cluster-name

Question

I am facing a weird issue with Elasticsearch (ES) Docker containers. Earlier I was running a standalone ES 7.10 container on host port 9200 (the internal container ports are the standard ES ports, 9200 and 9300). At the same time, I started three more ES containers, on a different 7.x minor version, that were supposed to form a cluster; let's call it docker-es-cluster.

These three containers were mapped to host ports 9200, 9201, and 9202, so the cluster container mapped to 9200 couldn't start because of a port conflict with the standalone ES 7.10 container.

So I stopped the standalone 7.10 container and restarted the three cluster containers, but now the other two containers, which listen on 9201 and 9202, are not starting, and their logs contain the warning below:

{"type": "server", "timestamp": "2020-12-14T15:56:57,651+0000", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "docker-cluster", "node.name": "es2", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [] to bootstrap a cluster: hfrom hosts providers and [{es2}{eBtsR2XgRVWqPdUAP_n_Ew}{tZ9FRAbPTAmZZle_5MaVoA}{172.18.0.3}{172.18.0.3:9300}{dim}{ml.machine_memory=2084032512, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }

After this, I stopped all Docker containers, deleted all Docker images, ran docker prune, and restarted the system, but nothing fixes the issue. Even when I start from a clean state, the cluster state seems to be messed up for the two ES containers, and restarting the containers does not fix it.
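For reference, a clean-slate reset like the one described above could be scripted roughly as follows. This is only a sketch: the exact commands are an assumption (the question does not list them), `docker compose` is the Compose v2 spelling of `docker-compose`, and `run` and `reset_es_cluster` are hypothetical helper names. `DRY_RUN` defaults to 1, so the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of a clean-slate reset for the compose project.
# Assumption: these are not the asker's exact commands.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_es_cluster() {
    # Remove the project's containers, networks, and volumes.
    run docker compose down --volumes --remove-orphans
    # Remove stopped containers, dangling images, unused networks and volumes.
    run docker system prune --force --volumes
    # Recreate every node so that none of them starts from old on-disk state.
    run docker compose up -d
}

reset_es_cluster
```

Once the nodes are back up, `curl "http://localhost:9200/_cat/nodes?v"` should list every node that has joined the cluster.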

My docker-compose service definition for the ES containers looks like this (I am not using volume bindings):

 es2:
    image: "docker.elastic.co/elasticsearch/elasticsearch:<es-version>"
    container_name: 2
    environment:
    - node.name=2
    - cluster.name=docker-cluster
    - cluster.initial_master_nodes=1,2,3
    - discovery.seed_hosts=1,3
    - ES_JAVA_OPTS=-Xms1g -Xmx1g
    ports:
    - "9201:9200"
    networks:
    - localenv

score 0 · Answer 1 · answered Dec 15 '20 at 12:38

Newer versions of Elasticsearch (I think 7.8 and above) have very strict rules about nodes joining and leaving clusters. You cannot easily detach a master node, or even a data node. I recommend using the elasticsearch-node tool: https://www.elastic.co/guide/en/elasticsearch/reference/current/node-tool.html

The easiest way is to run the command below on all nodes:

elasticsearch-node detach-cluster
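In a Docker setup the tool has to run while the node itself is stopped, so one option is a one-off container that mounts the node's data directory. The sketch below assumes the data lives on a named volume, which is not the case in the question's compose file; the volume name `esdata2` and the helper `detach_cluster` are hypothetical, and the image tag is the placeholder from the question. It prints the command by default and only executes it when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: run elasticsearch-node in a throwaway container against a data volume.
# ES_DATA_VOLUME is a hypothetical volume name, not taken from the question.
ES_DATA_VOLUME="${ES_DATA_VOLUME:-esdata2}"
ES_IMAGE="${ES_IMAGE:-docker.elastic.co/elasticsearch/elasticsearch:<es-version>}"

detach_cluster() {
    # Stop the node's regular container first; the tool cannot be used
    # while Elasticsearch is running on the same data path.
    set -- docker run --rm -it \
        -v "$ES_DATA_VOLUME:/usr/share/elasticsearch/data" \
        "$ES_IMAGE" bin/elasticsearch-node detach-cluster
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"        # dry run: only show the command
    else
        "$@"             # real run: the tool asks for confirmation
    fi
}

detach_cluster
```

Without a volume, the node's state lives only in the container's own filesystem, so removing and recreating the container discards it and there is nothing left to detach.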

hamid bayat


Thanks for your answer. How do I run it for Docker containers that are stopped, given that my containers are ephemeral? – Amit Dec 15 '20 at 12:49
@ElasticsearchNinja did you try docker exec to run the command in the container? – hamid bayat Dec 15 '20 at 13:10
You are right about the containers being ephemeral, but the cluster state is saved in every node's data path. – hamid bayat Dec 15 '20 at 13:16
Yes, but since my Docker container is stopped I can't use `docker exec`. I went inside the other container, which is up, but the command you provided can't be run while ES is running. – Amit Dec 15 '20 at 13:50

score 0 · Accepted Answer · answered Dec 31 '20 at 10:03

I fixed the issue by opening the Docker Desktop Troubleshoot menu and using the Clean / Purge data option.

Amit

