I'm at a loss here. I run a few dozen servers with widely varying amounts of traffic. As traffic increased over time, some servers started crashing (502/503 errors), and I figured it was due to the PHP configuration, which was unchanged from the defaults.
For instance, I set the config in /etc/php/7.0/fpm/pool.d/www.conf as follows, based on the hardware of the server:
pm.max_children = 70
pm.start_servers = 20
pm.min_spare_servers = 20
pm.max_spare_servers = 35
pm.max_requests = 500
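For context, a minimal sketch of one common rule of thumb for deriving pm.max_children from hardware: divide the RAM left over for PHP by the average memory footprint of one PHP-FPM child. The function name and all figures below are hypothetical placeholders for illustration, not measurements from these servers.

```python
# Rule-of-thumb sizing for pm.max_children.
# All numbers used here are illustrative assumptions, not real measurements.

def estimate_max_children(total_ram_mb, reserved_mb, avg_child_mb):
    """Return how many PHP-FPM children fit in the RAM left over for PHP.

    total_ram_mb : physical RAM of the server
    reserved_mb  : RAM set aside for the OS, web server, database, etc.
    avg_child_mb : average resident memory of one PHP-FPM child
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0
    # Round down so the pool can never exceed the available memory.
    return available_mb // avg_child_mb


# Hypothetical server: 8 GB RAM, 2 GB reserved, about 80 MB per child.
print(estimate_max_children(8192, 2048, 80))
```

The result is only an upper bound on what fits in memory; it says nothing about whether that many workers is enough for the traffic.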
After these edits, it was smooth sailing on all servers: no more spikes, no more errors. Until today, when I got a downtime push notification and /var/log/upstart/php7.0-fpm.log read:
[09-Jul-2018 08:51:01] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 18 idle, and 39 total children
[09-Jul-2018 08:52:03] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 14 idle, and 52 total children
[09-Jul-2018 08:52:04] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 19 idle, and 58 total children
[09-Jul-2018 08:53:06] WARNING: [pool www] server reached pm.max_children setting (70), consider raising it
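To make the growth in that log easier to see, here is a small sketch that pulls the spawning, idle and total counts out of the warnings and derives the number of busy children (total minus idle). The regular expressions are written against the exact messages quoted above; the function name is made up for illustration, and other PHP-FPM versions may word these lines differently.

```python
import re

# Matches the "seems busy" warnings quoted above.
BUSY_RE = re.compile(
    r"spawning (?P<spawning>\d+) children, "
    r"there are (?P<idle>\d+) idle, "
    r"and (?P<total>\d+) total children"
)
# Matches the warning logged when the hard limit is hit.
LIMIT_RE = re.compile(r"reached pm\.max_children setting \((?P<limit>\d+)\)")

# Shortened copies of the four log lines from this post.
LOG_LINES = [
    "[09-Jul-2018 08:51:01] WARNING: [pool www] seems busy, "
    "spawning 8 children, there are 18 idle, and 39 total children",
    "[09-Jul-2018 08:52:03] WARNING: [pool www] seems busy, "
    "spawning 8 children, there are 14 idle, and 52 total children",
    "[09-Jul-2018 08:52:04] WARNING: [pool www] seems busy, "
    "spawning 16 children, there are 19 idle, and 58 total children",
    "[09-Jul-2018 08:53:06] WARNING: [pool www] server reached "
    "pm.max_children setting (70), consider raising it",
]


def summarize(lines):
    """Return ([(spawning, idle, total, busy), ...], limit_hit_or_None)."""
    samples = []
    limit = None
    for line in lines:
        m = BUSY_RE.search(line)
        if m:
            idle, total = int(m["idle"]), int(m["total"])
            samples.append((int(m["spawning"]), idle, total, total - idle))
            continue
        m = LIMIT_RE.search(line)
        if m:
            limit = int(m["limit"])
    return samples, limit


samples, limit = summarize(LOG_LINES)
for spawning, idle, total, busy in samples:
    print(f"spawning={spawning} idle={idle} total={total} busy={busy}")
print(f"max_children limit hit: {limit}")
```

Read this way, the pool went from 39 to 70 children in about two minutes, while the idle count at each warning stayed just below the pm.min_spare_servers value of 20.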
The log shows essentially the same error I got before, just with higher numbers. I think I'm missing some other related setting that would keep this in bounds, but which one?