I've got a Keycloak standalone HA cluster running on a Docker host. The cluster uses JDBC_PING against a PostgreSQL database for discovery (it will eventually run on ECS, where multicast is not available).
Cluster discovery works well, and each node adds itself to the database on startup. However, nodes do not remove themselves when stopped with "docker stop". This is fine as long as at least one other node is up, since the remaining nodes automatically detect the downed node and rebalance, but if the last node goes down, its row remains. When a new node then starts, it tries to connect to the stale node and fails.
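A quick way to confirm the leftover row is to query the discovery table after the last node has stopped. The sketch below assumes the `psql` client is available and reuses the default connection values from the stack configuration (host `postgres`, port `5432`, database and user `keycloak`). The helper name `stale_rows_sql` and the cluster name passed to it are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints the SQL that lists discovery rows for one cluster.
# Table and column names come from the initialize_sql property of the stack.
stale_rows_sql() {
    cluster_name="$1"
    printf "SELECT own_addr, cluster_name, added FROM JGROUPSPING WHERE cluster_name = '%s' ORDER BY added;\n" "$cluster_name"
}

# Assumed usage (needs psql and a reachable database, so it is left commented out):
#   stale_rows_sql my-cluster | psql -h "${DB_ADDR:-postgres}" -p "${DB_PORT:-5432}" \
#       -U "${DB_USER:-keycloak}" -d "${DB_DATABASE:-keycloak}"
```

Any row still listed once every container is stopped is the stale entry a new node would try to contact.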
The JGroups TCP stack is as follows:
<stack name="tcp">
    <transport type="TCP" socket-binding="jgroups-tcp">
        <property name="external_addr">
            ${env.EXTERNAL_ADDR}
        </property>
    </transport>
    <protocol type="org.jgroups.protocols.JDBC_PING">
        <property name="connection_driver">
            org.postgresql.Driver
        </property>
        <property name="connection_url">
            jdbc:postgresql://${env.DB_ADDR:postgres}:${env.DB_PORT:5432}/${env.DB_DATABASE:keycloak}
        </property>
        <property name="connection_username">
            ${env.DB_USER:keycloak}
        </property>
        <property name="connection_password">
            ${env.DB_PASSWORD:password}
        </property>
        <property name="initialize_sql">
            CREATE TABLE IF NOT EXISTS JGROUPSPING ( own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, ping_data bytea DEFAULT NULL, added timestamp DEFAULT NOW(), PRIMARY KEY (own_addr, cluster_name))
        </property>
    </protocol>
    <protocol type="MERGE3"/>
    <protocol type="FD_SOCK"/>
    <protocol type="FD_ALL"/>
    <protocol type="VERIFY_SUSPECT"/>
    <protocol type="pbcast.NAKACK2"/>
    <protocol type="UNICAST3"/>
    <protocol type="pbcast.STABLE"/>
    <protocol type="pbcast.GMS"/>
    <protocol type="MFC"/>
    <protocol type="FRAG2"/>
</stack>
The Dockerfile is:
FROM jboss/keycloak:latest
# elevate to install iproute
USER root
RUN yum install -y iproute
USER jboss
ADD cli/* /opt/jboss/keycloak/cli/
RUN cd /opt/jboss/keycloak \
&& bin/jboss-cli.sh --file=cli/setup.cli \
&& rm -rf /opt/jboss/keycloak/standalone/configuration/standalone_xml_history
RUN sed -i -e "/.*<\/dependencies>$/i \ \ \ \ \ \ \ \ <module name=\"org.postgresql.jdbc\"\/>" \
    /opt/jboss/keycloak/modules/system/layers/base/org/jgroups/main/module.xml
ADD start.sh /opt/jboss/
ENTRYPOINT [ "/opt/jboss/start.sh" ]
CMD ["-b", "0.0.0.0", "--server-config", "standalone-ha.xml"]
EXPOSE 7600
And start.sh contains:
#!/bin/sh
DEFAULT_NIC=`ip route | grep default | awk '{print $NF}'`
export EXTERNAL_ADDR=`ip -f inet -o addr show $DEFAULT_NIC | cut -d" " -f 7 | cut -d/ -f 1`
if [ "$EXTERNAL_ADDR" = "" ]; then
EXTERNAL_ADDR=127.0.0.1
fi
sh /opt/jboss/docker-entrypoint.sh $@ -Djgroups.bind_addr=$EXTERNAL_ADDR -Djboss.bind.address.private=$EXTERNAL_ADDR -Djboss.bind.address.management=$EXTERNAL_ADDR -Djgroups.bind.address=$EXTERNAL_ADDR -Djava.net.preferIPv4Stack=true -Dignore.bind.address=true
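A sanity check that may be relevant with a wrapper script like this: `docker stop` signals the container's PID 1, so it matters whether a command launched from the wrapper keeps the wrapper's PID or runs as a child of it. The sketch below only illustrates that general shell behavior. It is not taken from the Keycloak image and does not establish that this is the cause; the function names are hypothetical.

```shell
#!/bin/sh
# Each function starts a "wrapper" shell that records its own PID and then
# launches an inner shell, which reports whether it runs under that same PID.

# Inner shell started with exec: it replaces the wrapper process.
pid_after_exec() {
    sh -c 'wrapper=$$; exec sh -c "[ \$\$ -eq $wrapper ] && echo same || echo different"'
}

# Inner shell started as an ordinary command: it runs as a child process.
# The trailing ":" keeps the inner shell from being the last command, since
# some shells exec the final command of a -c string instead of forking.
pid_as_child() {
    sh -c 'wrapper=$$; sh -c "[ \$\$ -eq $wrapper ] && echo same || echo different"; :'
}

pid_after_exec   # prints "same"
pid_as_child     # prints "different"
```

In the container, `docker top <container>` (or `ps` inside it) should show which process actually holds PID 1 while Keycloak is running.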
I can't see a reason why the nodes wouldn't remove their rows on shutdown. Are there any obvious configuration errors here?