I am currently using AWS Elastic Container Service (ECS). It runs behind an HAProxy server that acts as our load balancer, along with a Lambda function that acts as a service discovery mechanism for the servers. Our servers have a custom scaling metric based on queue times. When this metric is exceeded, a new server is added to our HAProxy load balancer and a new container is deployed to that server. The server scaling works fine; my issue is with the container scaling.
During a scale-up, the container is the first thing to attempt to start, and it fails because the new server is not yet available. Once the server is actually available, the container might not attempt to start again for quite a long time.
Key question: Is there any way to increase how often ECS attempts to scale up the container, or is there something special that has to be done to force it to keep retrying after a failure when using a custom load balancer?
I also see this behaviour when a service has a desired count greater than its current count: it will not create new containers unless some activity causes the service to update.
I know that this worked without issue when we were using an Application Load Balancer.
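To make the desired-count gap concrete, here is a minimal sketch in Python of how it can be detected and how a service update can be triggered by hand. It is a workaround sketch, not a confirmed fix. It assumes a boto3-style ECS client is passed in; the function names, cluster name, and service names are hypothetical. `describe_services` and `update_service` with `forceNewDeployment` are real ECS API calls, but whether forcing a deployment makes the scheduler retry sooner in this setup is an assumption on my part.

```python
from typing import Any, List, Tuple


def find_lagging_services(
    ecs_client: Any, cluster: str, service_names: List[str]
) -> List[Tuple[str, int, int]]:
    """Return (name, desired, running) for services running fewer tasks than desired.

    Note: describe_services accepts at most 10 service names per call.
    """
    response = ecs_client.describe_services(cluster=cluster, services=service_names)
    lagging = []
    for svc in response["services"]:
        if svc["runningCount"] < svc["desiredCount"]:
            lagging.append((svc["serviceName"], svc["desiredCount"], svc["runningCount"]))
    return lagging


def nudge_lagging_services(
    ecs_client: Any, cluster: str, service_names: List[str]
) -> List[str]:
    """Force a new deployment on each lagging service and return their names.

    Assumption: a forced deployment counts as the kind of "activity" that makes
    the service try to place tasks again. It also replaces tasks that are
    already running, so it is heavier than a plain retry.
    """
    nudged = []
    for name, _desired, _running in find_lagging_services(ecs_client, cluster, service_names):
        ecs_client.update_service(cluster=cluster, service=name, forceNewDeployment=True)
        nudged.append(name)
    return nudged
```

With real credentials this would be called as `nudge_lagging_services(boto3.client("ecs"), "my-cluster", ["my-service"])`, for example from the same Lambda that handles service discovery once the new server has registered.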