I'm running the RabbitMQ Docker image (rabbitmq:3-management) in AWS ECS. It works fine with no issues.
Then I added a bit more complexity and created a service with the same RabbitMQ image, but now attached to an AWS Network Load Balancer (my ultimate goal is a RabbitMQ cluster, so I need several instances behind a load balancer). The target group is configured with port 5672 and uses the same port for health checks. The interval between health checks is 30 seconds (the maximum available), and the threshold is 5. In the ECS service configuration, the health check grace period is 120 seconds, which should be enough for the service to start. What happens is that a few minutes after I start the service, it gets killed and restarted:
service Rabbit-master (instance i-xxx) (port 5672) is unhealthy in target-group Rabbit-cluster-target-group due to (reason Health checks failed)
'A few minutes' means 2, 5, or 9 minutes; it varies. It doesn't happen at startup but only after a while. I can also see that RabbitMQ itself is working fine (in the logs and in the management panel). So it is the ELB that causes the restart; it is not the case that RabbitMQ died first and the ELB then restarted it.
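To cross-check that port 5672 is reachable independently of the load balancer, a plain TCP connect can be attempted from another host in the same VPC. The sketch below is only an approximation of what I understand a TCP health check to do (open a connection and treat success as healthy); the function name `tcp_probe` and the default timeout are my own assumptions, not anything taken from the AWS configuration.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection performs the TCP handshake; the socket is closed
        # immediately on leaving the `with` block, without sending any AMQP data.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_probe("10.0.0.10", 5672)` would test the AMQP port, where `10.0.0.10` is a placeholder for the container instance's private IP. Running it in a loop at the same 30-second interval might show whether the port itself ever stops accepting connections.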
So my question is: what am I doing wrong, and how can I get RabbitMQ running stably in ECS together with an ELB? Is it wrong to use port 5672 for health checks? If so, which port should I use instead? 15672?
Sorry if I haven't provided enough details. I described those that seemed relevant to me. If you need anything more, I'll be happy to elaborate. Thanks!