I'm using Locust to load test an Amazon Linux EC2 instance running Apache (event MPM) and PHP-FPM. When I run my load test with 200 users (~28 requests per second), everything is fine. When I boost the number of users to 300 (~43 requests per second), I start seeing these errors in the Locust logs:
ConnectionError(MaxRetryError("HTTPConnectionPool(host='xxx.xxx.xxx.xxx', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x....>: Failed to establish a new connection: [Errno 24] Too many open files'))"))
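The test itself is nothing exotic: each simulated user just GETs "/". If each user sits on a fixed wait between requests, 200 users at ~28 req/s works out to roughly a 7-second wait per user, so a minimal locustfile along these lines (class and task names are illustrative, not my exact file) gives about that load shape:

# locustfile.py - minimal sketch of the kind of test being run
# (names are illustrative; the real test just GETs "/")
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    # ~7 s average wait per user: ~28 req/s at 200 users, ~43 req/s at 300
    wait_time = between(5, 9)

    @task
    def index(self):
        self.client.get("/")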
Researching online, I decided to bump up the number of available open file descriptors to see if I could get around this issue. I edited /etc/security/limits.conf and set the following values (possibly exaggerated, but I'm just trying to see if something sticks):
* soft nofile 65000
* hard nofile 65000
* soft nproc 10240
* hard nproc 10240
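A fresh login session should pick these values up; one quick way to confirm from a new SSH shell (equivalent to ulimit -Sn / ulimit -Su) is the Python stdlib resource module:

# session_limits.py - run from a *new* SSH session to confirm the
# limits.conf values apply to the session (same info as ulimit)
import resource

for name, rlim in (("nofile", resource.RLIMIT_NOFILE),
                   ("nproc", resource.RLIMIT_NPROC)):
    soft, hard = resource.getrlimit(rlim)
    print(f"{name}: soft={soft} hard={hard}")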
Afterwards, I restarted both Apache and PHP-FPM:
sudo service httpd restart
sudo service php-fpm restart
I also inspected the running processes to verify that the new limits were actually taking effect. One of Apache's child processes:
$ cat /proc/22725/limits
Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        0            unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             14745        14745        processes
Max open files            170666       170666       files
Max locked memory         65536        65536        bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       14745        14745        signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us
And one of PHP-FPM's child processes:
$ cat /proc/22963/limits
Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        0            unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             10240        10240        processes
Max open files            10240        10240        files
Max locked memory         65536        65536        bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       14745        14745        signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us
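Rather than eyeballing the full dumps, the relevant rows can be pulled straight out of /proc/<pid>/limits; a small sketch (the PIDs are the two children shown above):

# limits_check.py - print the "Max open files" / "Max processes" rows
# for the Apache and PHP-FPM children dumped above (PIDs from those dumps)
pids = [22725, 22963]

for pid in pids:
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith(("Max open files", "Max processes")):
                print(pid, line.rstrip())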
I've also raised the kernel-level limit on open files in /etc/sysctl.conf:

fs.file-max = 512000

Then I applied the new value with sysctl -p. Again, this is probably egregious, but I saw the same results with a value of 65000.
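One way to see how close the box actually gets to that kernel-wide ceiling is /proc/sys/fs/file-nr, which holds the number of allocated file handles, the number of free ones, and the current fs.file-max:

# file_nr.py - compare system-wide allocated file handles against fs.file-max
with open("/proc/sys/fs/file-nr") as f:
    allocated, free, maximum = (int(x) for x in f.read().split())

print(f"allocated={allocated} free={free} fs.file-max={maximum}")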
Under load, I'm only seeing ~4,200 open files, which is puzzling given the limits I've configured:
$ lsof | wc -l
4178
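For a per-process view (lsof prints one line per open file across every process, including memory-mapped libraries, so it isn't a pure descriptor count), counting the entries under /proc/<pid>/fd for the httpd and php-fpm processes is more direct; a rough sketch, run as root:

# fd_count.py - count open file descriptors per httpd/php-fpm process
# by listing /proc/<pid>/fd (run as root so all fd directories are readable)
import os

total = 0
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/comm") as f:
            name = f.read().strip()
        if name not in ("httpd", "php-fpm"):
            continue
        nfds = len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        continue  # process exited or fd dir not readable
    total += nfds
    print(f"{name} (pid {pid}): {nfds} fds")

print(f"total: {total}")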
During all of this, my CPU usage never goes above 20%, and my server still has around 3GB of free memory.
Any ideas?