We have a system built on AWS. We use Elastic Beanstalk with autoscaling, and our database (MySQL) is hosted on RDS. The application runs on Apache and PHP.

We wanted to test the system under high load, so we chose large instances for the backend (20 instances, each with 4 CPUs and 15 GB of RAM) and a big instance for RDS (8 CPUs, 30 GB of RAM). Then we ran a marketing campaign, and a lot of users came to our website.

We were monitoring latency the whole time, and it suddenly increased to 7 seconds. I would understand if that happened because CPU load was at 100% or there was no free memory. But CPU utilization was ~50% on the Apache servers and ~20% on the RDS server, the database was handling ~20 requests per second, and there was enough memory. So I don't know why the latency increased.

Steps I took to investigate:
- I saw the error "Too many connections", so I increased the max_connections option in RDS.
- I increased the number of clients Apache can serve, following this article: http://www.genericarticles.com/mediawiki/index.php?title=How_to_optimize_apache_web_server_for_maximum_concurrent_connections_or_increase_max_clients_in_apache
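
To make clear how these two settings interact, here is a rough sketch of the arithmetic (in Python, only for illustration). Only the 20 backend instances come from our setup. The per-instance Apache client limit of 256 and the max_connections value of 2000 are assumed example numbers, and the function names are hypothetical. It also assumes each Apache worker holds at most one database connection, which may not match how the PHP code actually connects.

```python
def worst_case_db_connections(instances: int,
                              max_clients_per_instance: int,
                              connections_per_worker: int = 1) -> int:
    """Upper bound on simultaneous DB connections the web tier can open."""
    return instances * max_clients_per_instance * connections_per_worker


def connection_headroom(instances: int,
                        max_clients_per_instance: int,
                        max_connections: int) -> int:
    """Positive: the database limit covers the worst case.
    Negative: the web tier can still hit "Too many connections"."""
    demand = worst_case_db_connections(instances, max_clients_per_instance)
    return max_connections - demand


if __name__ == "__main__":
    instances = 20            # from the setup described above
    max_clients = 256         # ASSUMPTION: example Apache per-instance client limit
    max_connections = 2000    # ASSUMPTION: example RDS max_connections value

    demand = worst_case_db_connections(instances, max_clients)
    headroom = connection_headroom(instances, max_clients, max_connections)
    print(f"worst-case connections: {demand}, headroom: {headroom}")
```

If the headroom is negative, raising the Apache limit without also raising max_connections (or the other way around) just moves the queueing from one tier to the other, which could look like high latency with idle CPUs.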
But the problem still exists, and I don't know how to fix it. Why does latency increase when there are enough resources to handle everything? Please help. Thank you.