I have a simple Rails app running on Puma behind an nginx reverse proxy, configured in a standard way. Both run on an AWS t2.micro instance.
The MySQL database runs on another t2.micro instance.
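For context, by "standard" I mean the Puma config is essentially what Rails generates by default; roughly this sketch (the worker/thread counts are the stock defaults, nothing I tuned myself):

```ruby
# config/puma.rb -- approximately the Rails-generated default
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))

threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count

# Load the app before forking workers to share memory via copy-on-write
preload_app!

port        ENV.fetch("PORT", 3000)
environment ENV.fetch("RAILS_ENV", "production")
```

nginx just proxies to this port with a plain proxy_pass.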
If I run a JMeter load test for a simple login use case with 20 concurrent logins, I get the following result:
summary + 1 in 00:00:03 = 0.3/s Avg: 2542 Min: 2542 Max: 2542 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary + 79 in 00:00:06 = 13.7/s Avg: 1734 Min: 385 Max: 3246 Err: 0 (0.00%) Active: 0 Started: 20 Finished: 20
summary = 80 in 00:00:09 = 9.2/s Avg: 1744 Min: 385 Max: 3246 Err: 0 (0.00%)
When I run the same test with 100 concurrent logins, I get the following result:
summary + 362 in 00:00:14 = 25.0/s Avg: 2081 Min: 381 Max: 9730 Err: 0 (0.00%) Active: 21 Started: 100 Finished: 79
summary + 38 in 00:00:13 = 3.0/s Avg: 4887 Min: 625 Max: 17995 Err: 0 (0.00%) Active: 0 Started: 100 Finished: 100
summary = 400 in 00:00:27 = 14.8/s Avg: 2347 Min: 381 Max: 17995 Err: 0 (0.00%)
The avg and max response times go up by a factor of 2 to 5. This is not a big surprise, but I cannot find the bottleneck when I look at the server's CPU and memory load: the max CPU usage during the test window is 36%, and memory consumption barely changes at all (up about 5 MB).
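As a rough sanity check I applied Little's law (requests in flight = throughput × avg latency) to the summary lines above; a quick Ruby snippet with those numbers:

```ruby
# Little's law: concurrency = throughput (req/s) * avg latency (s).
# Numbers are taken from the JMeter summary lines above (latency in ms).
def in_flight(throughput_per_s, avg_latency_ms)
  (throughput_per_s * avg_latency_ms / 1000.0).round(1)
end

puts in_flight(9.2, 1744)   # 20-user test  -> 16.0
puts in_flight(14.8, 2347)  # 100-user test -> 34.7
```

So with 100 users only about 35 requests seem to be in flight at once, which makes me suspect requests are queueing somewhere (Puma's thread pool, the proxy, or the DB) rather than being limited by CPU or memory.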
My questions are: Where is the actual bottleneck? What is the right scaling strategy? Should I put the Puma workers on separate EC2 instances?
I am not very experienced with setting up a server like this, so all hints are welcome.