Why does nginx automatically restart (and fail to do so)

Question

Once in a while, for a reason I don't understand (I'm a very amateurish admin), nginx automatically restarts, fails to do so, which results in an interruption of service. It happened this morning:

$ journalctl -u nginx
-- Logs begin at Mon 2018-09-03 11:46:24 CEST, end at Tue 2018-09-04 09:30:22 CEST. --
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopping A high performance web server and a reverse proxy server...
Sep 04 07:55:21 vpsXXXXXX.ovh.net systemd[1]: Stopped A high performance web server and a reverse proxy server.
Sep 04 07:55:27 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29333]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:27 vpsXXXXXX.ovh.net nginx[29335]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:28 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:29 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] bind() to 0.0.0.0:5281 failed (98: Address already in use)
Sep 04 07:55:30 vpsXXXXXX.ovh.net nginx[29335]: nginx: [emerg] still could not bind()
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Control process exited, code=exited status=1
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: Failed to start A high performance web server and a reverse proxy server.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Unit entered failed state.
Sep 04 07:55:30 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed with result 'exit-code'.

1h30 later I realize my web server is down, so I just systemctl restart nginx and everything works just fine.

Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Starting A high performance web server and a reverse proxy server...
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30003]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net nginx[30004]: nginx: [warn] could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_siz
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Sep 04 09:23:48 vpsXXXXXX.ovh.net systemd[1]: Started A high performance web server and a reverse proxy server.

Unfortunately I didn't record last time it happened and the logs are apparently too old:

$ zgrep "bind()" /var/log/nginx/*
(just this morning's episode)

... but I'm sure I had a similar issue at least 2 or 3 times this year (which is acceptable but annoying).

It does not seem to be related by a server restart:

$ uptime
 09:43:37 up 12 days,  7:00,  1 user,  load average: 0.00, 0.00, 0.00

This is my root user crontab, I don't think it is related as the schedule do not seem to match, but as I cannot think of anywhere to look for, so here it is anyway:

# let's encrypt renewal
0 5 1 * * certbot renew --authenticator standalone --installer  nginx --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" -n
# automatically update debian
0 5 * * 1 apt -qq update && apt dist-upgrade -qq -y
# couldn't find another way to make sure that datetime.today() actually returns today's date in my flask apps
0 0 * * * systemctl reload uwsgi.service

What is causing this?

EDIT: In /var/log/nginx/error.log, I can find a little more info:

2018/09/04 07:55:19 [warn] 29269#29269: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:19 [info] 29269#29269: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29271#29271: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29271#29271: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:20 [warn] 29273#29273: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:20 [info] 29273#29273: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/04 07:55:26 [warn] 29305#29305: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:26 [notice] 29305#29305: signal process started
2018/09/04 07:55:26 [error] 29305#29305: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/04 07:55:26 [warn] 29306#29306: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/04 07:55:27 [warn] 29333#29333: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
(then the same lines journalctl displays)

Why would that open() "/run/nginx.pid" failed (2: No such file or directory) happen all of a sudden?

EDIT2: It happened again this morning:

$ cat /var/log/nginx/error.log
2018/09/13 10:12:19 [warn] 7230#7230: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:19 [info] 7230#7230: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:31 [warn] 7243#7243: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:31 [notice] 7243#7243: signal process started
2018/09/13 10:12:31 [error] 7243#7243: open() "/run/nginx.pid" failed (2: No such file or directory)
2018/09/13 10:12:31 [warn] 7244#7244: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [warn] 7247#7247: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7247#7247: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:32 [warn] 7251#7251: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:32 [info] 7251#7251: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7255#7255: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [info] 7255#7255: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 10:12:33 [warn] 7256#7256: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:443 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:80 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: bind() to 0.0.0.0:5281 failed (98: Address already in use)
2018/09/13 10:12:33 [emerg] 7256#7256: still could not bind()
2018/09/13 10:12:36 [alert] 7245#7245: unlink() "/run/nginx.pid" failed (2: No such file or directory)
-------- this is when I realized my web services are down and manually restart nginx
2018/09/13 11:25:11 [warn] 7578#7578: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size
2018/09/13 11:25:11 [info] 7578#7578: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:66
2018/09/13 11:25:12 [warn] 7579#7579: could not build optimal proxy_headers_hash, you should increase either proxy_headers_hash_max_size: 512 or proxy_headers_hash_bucket_size: 64; ignoring proxy_headers_hash_bucket_size

I think I will just add a cronjob that checks every 5 min if nginx is up and start it if it isn't but this is so dirty it makes me want to puke.

Your logs are too few to record the incident from _this morning_? — Lenniey, Sep 04 '18 at 08:42
might be occurring with an update of systemd? Compare with your dpkg.log — danblack, Sep 04 '18 at 08:51
@Lenniey sorry if I wasn't clear, I was talking about the previous incident. I added some context that `journalctl` did not include. @danblack Apparently there was no update this morning according to `dpkg.log`. — nicoco, Sep 04 '18 at 11:08

Why does nginx automatically restart (and fail to do so)

0 Answers0