I'm hosting a Shiny app on ECS Fargate. It works fairly well, but it occasionally crashes while in use. I traced the crashes to the following messages in the service's Events tab:
service YYYY has started 1 tasks: task XXX
service YYYY has stopped 1 running tasks: task XXX
service YYYY deregistered 1 targets in target-group (Name of Elastic Load Balancer)
service YYYY (port 3838) is unhealthy in target-group (Name of Elastic Load Balancer) due to (reason Request timed out).
Does anyone know what might be causing this, or how I can investigate it further?
Could this be linked to spikes in CPU utilization within the application?
I've seen CPU utilization spike to 100% at certain times. If a user uses the application in a way that causes this high utilization, could that cause the container to be deemed unhealthy?
Also, auto-scaling is enabled for the application and set to trigger when CPU utilization exceeds 50%. However, it is not being triggered at the moments when CPU utilization spikes to 100%. Any ideas?
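
On the investigation side, one thing that could be checked is the stop reason ECS records for each stopped task. Below is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function names `summarize_stopped_tasks` and `fetch_stopped_tasks` are hypothetical helpers made up for illustration, and the cluster and service names are placeholders. ECS only keeps stopped tasks for a limited time, so this would need to run soon after a crash.

```python
from typing import Any


def summarize_stopped_tasks(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Pull the stop details out of an ECS DescribeTasks response (hypothetical helper)."""
    summaries = []
    for task in response.get("tasks", []):
        summaries.append({
            "taskArn": task.get("taskArn"),
            # Task-level explanation of why ECS stopped the task.
            "stopCode": task.get("stopCode"),
            "stoppedReason": task.get("stoppedReason"),
            # Per-container exit code and reason, when ECS recorded them.
            "containers": [
                {
                    "name": c.get("name"),
                    "exitCode": c.get("exitCode"),
                    "reason": c.get("reason"),
                }
                for c in task.get("containers", [])
            ],
        })
    return summaries


def fetch_stopped_tasks(cluster: str, service: str) -> list[dict[str, Any]]:
    """List recently stopped tasks for a service and summarize them (hypothetical helper)."""
    import boto3  # imported here so the summarizer above works without AWS access

    ecs = boto3.client("ecs")
    listed = ecs.list_tasks(cluster=cluster, serviceName=service, desiredStatus="STOPPED")
    task_arns = listed.get("taskArns", [])
    if not task_arns:
        return []
    return summarize_stopped_tasks(ecs.describe_tasks(cluster=cluster, tasks=task_arns))
```

Calling `fetch_stopped_tasks("my-cluster", "YYYY")` (placeholder names) should show whether the task was stopped because of failed load balancer health checks or because the container itself exited, which would help narrow down whether the CPU spikes are involved.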