logrotate causes php5-fpm downtime

Question

I've noticed that one of our servers starts returning errors just after logrotate runs. In the nginx error log I can see:

[error] 8501#0: *118126869 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: xxx.yyy.zz.ww, server: www.test.com, request: "GET /index.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9011", host: "www.test.com"

I have tried adding a postrotate action that makes PHP reload gracefully, but the error still happens. Our current logrotate configuration is as follows:

/var/log/php5-fpm.log {
        daily
        missingok
        rotate 52
        compress
        delaycompress
        notifempty
        create 644 root root
        postrotate
                [ ! -f /var/run/php5-fpm.pid ] || kill -USR2 `cat /var/run/php5-fpm.pid`
        endscript
}

PHP config is as follows:

[www-9011]

user = www-data
group = www-data
listen = 127.0.0.1:9011
listen.backlog = 65535
pm = ondemand
pm.max_children = 50
pm.process_idle_timeout = 10s;
pm.max_requests = 500
rlimit_files = 16384
chdir = /
catch_workers_output = no
php_admin_value[error_log] = /var/log/fpm-php.www.log
php_admin_flag[log_errors] = on

We're running Ubuntu 12.04 and PHP 5.3.10.

It seems to be. A new log file for PHP gets created, say at 06:46, and the nginx error log throws a few "104: Connection reset by peer" errors immediately after. The same thing happens on all of our servers running this setup, and there's always a short outage just after logrotate runs. — Kasia Gogolek, Nov 02 '15 at 11:31
Reloading PHP gracefully doesn't cause any outage, by the way. To confuse things further, I can't reproduce it either if I run sudo logrotate --force /etc/logrotate.d/php5-fpm — Kasia Gogolek, Nov 02 '15 at 11:34

score 3 · Answer 1 · answered Jan 20 '18 at 22:36

Send USR1 instead

https://github.com/php/php-src/blob/b7a7b1a624c97945c0aaa49d46ae996fc0bdb6bc/sapi/fpm/fpm/fpm_events.c#L94

The source code shows that this signal is specifically for log rotation: it makes php-fpm reopen its log files. I know Ubuntu 14.04 didn't handle FPM reloads (USR2) very well, and I assume it's the same for older versions too.

So change the postrotate script to

postrotate
                [ ! -f /var/run/php5-fpm.pid ] || kill -USR1 `cat /var/run/php5-fpm.pid`
endscript

so that php-fpm simply reopens its logs instead of reloading.
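A slightly more defensive variant of that postrotate command, as a minimal sketch. It assumes the pid file path from the question, and the function name reopen_fpm_logs is made up for illustration. Before sending USR1 it checks that the pid file exists and that the pid still belongs to a running process.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): ask the php-fpm master
# process to reopen its log files after rotation.
# $1: pid file path; defaults to the path used in the question.
reopen_fpm_logs() {
    pidfile="${1:-/var/run/php5-fpm.pid}"

    # No pid file means php-fpm is not running, so there is nothing to do.
    [ -f "$pidfile" ] || return 0

    pid=$(cat "$pidfile")

    # kill -0 delivers no signal; it only checks that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        # USR1 asks php-fpm to reopen its logs; USR2 would reload it.
        kill -USR1 "$pid"
    fi
}
```

The function could live in a small script that postrotate calls as reopen_fpm_logs /var/run/php5-fpm.pid. The body sticks to POSIX shell because logrotate runs its scripts with /bin/sh.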

BurninLeo · Answer 2 · 2023-08-20T12:15:14.467

This is an old but still unanswered question, so I'll try to give an answer for those still searching. The logrotate configuration file says that, after doing the rotation, logrotate shall send a signal to the php-fpm process using kill:

        postrotate
                [ ! -f /var/run/php5-fpm.pid ] || kill -USR1 `cat /var/run/php5-fpm.pid`

The current default configuration in the Ubuntu 16 repositories instead calls a helper, php5-fpm-reopenlogs, that is intended for log file rotation:

        postrotate
                /usr/lib/php5/php5-fpm-reopenlogs

Similarly with PHP 7:

        postrotate
                /usr/lib/php/php7.0-fpm-reopenlogs

Here's a complete /etc/logrotate.d/php5-fpm to show this postrotate in context:

/var/log/php5-fpm.log {
        rotate 12
        weekly
        missingok
        notifempty
        compress
        delaycompress
        postrotate
                # invoke-rc.d php5-fpm reload > /dev/null
                /usr/lib/php5/php5-fpm-reopenlogs
        endscript
}

USR2 is to reload, USR1 is to reopen log files and is the one that should be used during postrotate — darryn.ten, Aug 14 '23 at 05:04
Thanks, confirmed that here (https://linux.die.net/man/8/php-fpm) and corrected the above script. — BurninLeo, Aug 20 '23 at 12:16

score 0 · Answer 3 · answered Aug 20 '23 at 13:02

It seems like you're encountering errors in your server's nginx error log after the logrotate process has run. The error message indicates an issue related to the "recv()" function and mentions a connection reset by the peer. This occurs while the server is trying to read a response header from an upstream server. The log also provides information about the client's IP address, the server's domain, the type of request, and the upstream server's details.

This type of error is often related to the communication between nginx and a FastCGI backend (in this case, the upstream server at "127.0.0.1:9011"). The "Connection reset by peer" error indicates that the upstream server abruptly closed the connection while nginx was expecting a response. This can happen for various reasons, such as the upstream server crashing or the connection being dropped due to network issues.

To troubleshoot and resolve this issue, you might consider the following steps:

Check Upstream Server: Verify that the FastCGI backend (127.0.0.1:9011) is running and functioning properly. Check its logs for any errors or crashes.
Network Connectivity: Examine the network connectivity between nginx and the upstream server. Make sure there are no network issues causing connection interruptions.
Resource Limitations: Ensure that the upstream server has enough resources (CPU, memory, etc.) to handle incoming requests. Resource constraints could lead to crashes or connection resets.
FastCGI Configuration: Review the FastCGI configuration in your nginx settings. Make sure the settings are correctly configured to match the FastCGI server's parameters.
Update Software: Ensure that both nginx and the FastCGI backend are running the latest stable versions. Software updates can often address known issues.
Error Handling: Implement proper error handling in your code to gracefully handle unexpected situations. This can prevent the upstream server from crashing and abruptly closing connections.
Load Balancing: If feasible, consider implementing load balancing with multiple upstream servers to distribute traffic and reduce the impact of one server's failure.
Monitor and Analyze: Set up monitoring tools to keep track of server performance and errors. Analyzing trends and patterns can provide insights into the root cause of the issue.
Logs: Examine the logs of both nginx and the upstream server for any additional error messages or relevant information that might shed light on the problem.
Configuration Review: Double-check your nginx and FastCGI configurations for any misconfigurations or inconsistencies that could be causing the issue.

Remember to implement changes one at a time and test after each modification to identify the specific action that resolves the problem.
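The "Logs" and "Monitor and Analyze" steps above can be made concrete by counting upstream resets per minute and comparing the busy minutes with the time logrotate runs. This is a minimal sketch, assuming nginx's default error log layout, where each line starts with a YYYY/MM/DD HH:MM:SS timestamp. The function name and the log path in the usage note are illustrative.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): count "Connection reset by
# peer" errors per minute in an nginx error log.
# $1: path to the nginx error log.
count_resets_per_minute() {
    # Keep only the reset errors, reduce each line to "date HH:MM",
    # then count how many lines fall into each minute.
    grep -F '104: Connection reset by peer' "$1" |
        awk '{ print $1, substr($2, 1, 5) }' |
        sort | uniq -c
}
```

For example, count_resets_per_minute /var/log/nginx/error.log (path assumed) prints one line per minute that saw resets, prefixed with the count. A spike that lines up with the daily logrotate run would support the suspicion raised in the question.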
