
After joining a RabbitMQ node to the cluster, we noticed that the other nodes in the cluster marked it as down under alarms:

root@rabbit2:# rabbitmqctl cluster_status
Cluster status of node 'rabbit@rabbit2' ...
[{nodes,[{disc,['rabbit@rabbit1','rabbit@rabbit2',
                'rabbit@rabbit3']}]},
 {running_nodes,['rabbit@rabbit1','rabbit@rabbit3','rabbit@rabbit2']},
 {cluster_name,<<"rabbit@rabbit2.domain.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@rabbitmq-1',[nodedown]},
          {'rabbit@rabbitmq-3',[]},
          {'rabbit@rabbitmq-2',[]}]}]

Checking on the node in question, there is no issue:

root@rabbit1:# rabbitmqctl cluster_status

Cluster status of node 'rabbit@rabbitmq-1' ...
[{nodes,[{disc,['rabbit@rabbit1','rabbit@rabbit2',
                'rabbit@rabbit3']}]},
 {running_nodes,['rabbit@rabbit2','rabbit@rabbit3','rabbit@rabbit1']},
 {cluster_name,<<"rabbit@rabbit2.domain.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@rabbit2',[]},
          {'rabbit@rabbit3',[]},
          {'rabbit@rabbit1',[]}]}]

We have tried purging the RabbitMQ install and all of its Erlang components and starting anew, but the node still shows the same issue.
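
For reference, on a Debian-based host a purge of this kind looks roughly like the following (package names and paths are assumptions and may differ on other systems):

root@rabbit1:# apt-get purge rabbitmq-server erlang-base   # plus any other installed erlang-* packages
root@rabbit1:# rm -rf /var/lib/rabbitmq /etc/rabbitmq      # removes the mnesia database and configuration
root@rabbit1:# apt-get install rabbitmq-server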

kazaamjt
1 Answer


I hit the same issue. The only way to clear the alarm was to remove the node from the cluster and join it again, but the alarm appeared again after some time, always on the same node. Strangely, the node seems to work normally even while the alarm is set.
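
A minimal sketch of that remove/rejoin cycle, assuming rabbit@rabbit1 is the affected node and rabbit@rabbit2 is a healthy one (node names are placeholders, substitute your own):

# on the affected node: stop the app and reset its cluster state
root@rabbit1:# rabbitmqctl stop_app
root@rabbit1:# rabbitmqctl reset

# on a healthy node, only if the affected node could not reset itself cleanly
root@rabbit2:# rabbitmqctl forget_cluster_node rabbit@rabbit1

# back on the affected node: rejoin and start
root@rabbit1:# rabbitmqctl join_cluster rabbit@rabbit2
root@rabbit1:# rabbitmqctl start_app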

I investigated further and probably found the solution:

  • Make sure that the hostnames rabbit1, rabbit2 and rabbit3 are resolvable on all cluster hosts.
  • I had disabled IPv6 on one cluster host (the one that was reported as down). The IPv6 setting (enabled/disabled) should be the same on all hosts. A quick way to check both points is shown below.
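
A minimal sketch of those checks on each host (getent and sysctl are assumptions about the tooling available on the hosts):

# every cluster hostname should resolve on every host
root@rabbit1:# getent hosts rabbit1 rabbit2 rabbit3

# the IPv6 state should be identical across hosts (1 = disabled, 0 = enabled)
root@rabbit1:# sysctl net.ipv6.conf.all.disable_ipv6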

maker