Apologies if this is a stupid question; it's my first week on RabbitMQ and I've been thrown in at the deep end.
The RabbitMQ version is 3.7.6, the OS is Red Hat 7.4 (Maipo) and Erlang is 9.3.2.
We have 2 RabbitMQ servers in a cluster, both AWS instances. When they are spun up, they build and join the cluster quite happily. However, if I run stop_app, reset and then start_app, the start command fails, claiming the address is already in use:
{throw:{could_not_start,rabbitmq_management,
{rabbitmq_management,
{bad_return,
{{rabbit_mgmt_app,start,[normal,[]]},
{'EXIT',
{{could_not_start_listener,
[{port,35672}],
{shutdown,
{failed_to_start_child,ranch_acceptors_sup,
{listen_error,rabbit_web_dispatch_sup_35672,eaddrinuse}}}},
}
The port is definitely not in use: a simple port check during the start shows the port go from closed, to open (briefly), and then closed again. The only way I have found to resolve this is to restart the service, after which the node comes back up.
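For anyone wanting to reproduce the port check, here is a minimal sketch of the kind of polling described above. It assumes Python 3 is available on the node; the function names `port_is_open` and `watch_port` are made up for illustration and are not part of RabbitMQ or any of its tooling.

```python
import socket
import time


def port_is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def watch_port(host, port, duration=10.0, interval=0.1):
    """Poll the port for `duration` seconds.

    Returns a list of (elapsed_seconds, state) tuples, one entry
    each time the state changes between "open" and "closed".
    """
    transitions = []
    last_state = None
    start = time.monotonic()
    while time.monotonic() - start < duration:
        state = "open" if port_is_open(host, port) else "closed"
        if state != last_state:
            elapsed = round(time.monotonic() - start, 2)
            transitions.append((elapsed, state))
            last_state = state
        time.sleep(interval)
    return transitions
```

Running `watch_port("127.0.0.1", 35672)` in one terminal while issuing start_app in another should record each closed/open transition with a timestamp. Note that a check like this only sees whether something accepts connections; it cannot tell which process holds the socket.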
I have also changed the port in the config file and retried the start, and I get exactly the same error.
I do have another issue but would like to solve this one first.
These are the contents of rabbitmq-server.config:
% This file managed by Puppet
% Template Path: rabbitmq/templates/rabbitmq.config
[
{rabbit, [
{cluster_nodes, {['rabbit@server1', 'rabbit@server2'], disc}},
{cluster_partition_handling, ignore},
{tcp_listen_options, [
{keepalive, true},
{backlog, 128},
{nodelay, true},
{linger, {true, 0}},
{exit_on_close, false}
]},
{log_levels, [{connection, debug}]},
{loopback_users, []},
{default_user, <<"admin">>},
{default_pass, <<"password">>}
]},
{kernel, [
]}
,
{rabbitmq_management, [
{listener, [
{port, 35672}
]}
]}
].
% EOF
Debugging / Log:
2018-06-14 14:09:35.712 [info] <0.33.0> Application rabbitmq_management_agent started on node rabbit@server2
2018-06-14 14:09:35.712 [debug] <0.1152.0> Supervisor rabbit_web_dispatch_sup started rabbit_web_dispatch_registry:start_link() at pid <0.1153.0>
2018-06-14 14:09:35.712 [debug] <0.1152.0> Supervisor rabbit_web_dispatch_sup started gen_event:start_link({local,webmachine_log_event}) at pid <0.1154.0>
2018-06-14 14:09:35.712 [info] <0.33.0> Application rabbitmq_web_dispatch started on node rabbit@server2
2018-06-14 14:09:35.712 [info] <0.33.0> Application amqp_client started on node rabbit@server2
2018-06-14 14:09:35.741 [debug] <0.1162.0> Supervisor {<0.1162.0>,ranch_listener_sup} started ranch_conns_sup:start_link(rabbit_web_dispatch_sup_35672, worker, 5000, ranch_tcp, 5000, cowboy_clear) at pid <0.1163.0>
2018-06-14 14:09:35.741 [error] <0.1164.0> Failed to start Ranch listener rabbit_web_dispatch_sup_35672 in ranch_tcp:listen([{port,35672}]) for reason eaddrinuse (address already in use)
2018-06-14 14:09:35.741 [error] <0.1164.0> CRASH REPORT Process <0.1164.0> with 0 neighbours exited with reason: {listen_error,rabbit_web_dispatch_sup_35672,eaddrinuse} in ranch_acceptors_sup:listen_error/4 line 59
2018-06-14 14:09:35.741 [error] <0.1162.0> Supervisor {<0.1162.0>,ranch_listener_sup} had child ranch_acceptors_sup started with ranch_acceptors_sup:start_link(rabbit_web_dispatch_sup_35672, 100, ranch_tcp, [{port,35672}]) at undefined exit with reason {listen_error,rabbit_web_dispatch_sup_35672,eaddrinuse} in context start_error
2018-06-14 14:09:35.742 [error] <0.1153.0> ** Generic server rabbit_web_dispatch_registry terminating
Any suggestions welcome...
Thanks