I have a high-traffic webserver (up to ~50 page loads per second) running nginx, which passes HTML requests to a PHP-FPM backend and serves static assets directly.
Under this load, I see a surprisingly high amount of disk I/O from the nginx worker processes. My pages include a lot of static images that are replaced every few hours, so heavy disk reads are expected, but I actually see more writing than reading, with a sustained write rate on the order of 10 MB/s from the nginx processes (as reported by iotop).
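(For reference, the per-process rate comes from watching something like batch-mode iotop; the exact invocation is incidental:)
# only show processes currently doing I/O, one line per process, 5-second samples
iotop -o -P -d 5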
With my Docker containers up for 12 days, this is the I/O history for one of the nginx processes:
$ cat /proc/nginx-pid/io
rchar: 34537119315778
wchar: 27295224195419
syscr: 2217567839
syscw: 2285538495
read_bytes: 1499124252672
write_bytes: 7338930909184
cancelled_write_bytes: 141058945024
That's 1.5TB read over 12 days, which makes sense given my static assets and traffic, but 7.3TB written to disk by nginx seems insane to me.
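For scale, 7,338,930,909,184 bytes over 12 days works out to roughly 7 MB/s of sustained writes from that one process, so it is the same order of magnitude as the rate iotop reports.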
Here's the output of time strace -p $nginx_pid -c over a period of 60 seconds:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.05    0.840451           5    178693      2446 write
 16.52    0.420117           4    104554           pread64
 10.09    0.256603           6     39772         1 readv
  9.24    0.234970          11     20994           pwritev
  6.60    0.167708          16     10700       188 open
  4.92    0.125204           3     37249           epoll_wait
  4.47    0.113708           3     41601     20572 read
  3.70    0.094091          23      4154      4154 connect
  3.16    0.080408           5     15515           close
  1.77    0.045090          11      4260           writev
  1.42    0.036188           9      4139           rename
  0.92    0.023423           6      4139           chmod
  0.68    0.017227           2     10515           fstat
  0.57    0.014555           3      5131           epoll_ctl
  0.53    0.013573           3      4154           socket
  0.53    0.013569           3      4903           recvfrom
  0.35    0.008942           2      5892           getpid
  0.31    0.007828           2      4156           getsockopt
  0.28    0.007092           2      3725           getsockname
  0.28    0.007003           2      4154           ioctl
  0.24    0.006143           5      1258         1 stat
  0.19    0.004889           4      1324       670 accept4
  0.12    0.003023           4       690           pwrite64
  0.04    0.001125           2       631           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    2.542930                512303     28032 total
2.55user 7.57system 1:00.56elapsed 16%CPU (0avgtext+0avgdata 840maxresident)k
0inputs+0outputs (0major+273minor)pagefaults 0swaps
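What I can't tell from this summary is where those write/pwritev calls actually go. I assume re-attaching with strace's file-descriptor decoding would show whether they target sockets or regular files; something along these lines (the -y flag and the syscall list are just my guess at a useful invocation, $nginx_pid as above):
# decode fds for write-type syscalls to see whether they hit sockets or files
strace -f -y -e trace=write,writev,pwritev,pwrite64 -p $nginx_pid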
Some relevant nginx config:
worker_processes 8;
worker_rlimit_nofile 20000;
worker_connections 5000;
multi_accept on;
use epoll;
access_log off;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
open_file_cache max=10000 inactive=60s;
open_file_cache_valid 5s;
open_file_cache_min_uses 6;
fastcgi_buffers 8 16k;
I don't see why the nginx process should be writing this much, especially with the access log disabled. Running lsof -p nginx-pid shows that most files opened by the process are sockets and the images being served. Only the sockets are open for writing, so I guess all that writing must be going through those?
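(Concretely, a filter like this is what I mean by the sockets being open for writing; the awk expression keeps only descriptors whose FD column shows w or u access, with $nginx_pid standing in for the worker PID:)
# list only file descriptors opened for writing or read/write
lsof -p $nginx_pid | awk '$4 ~ /^[0-9]+[wu]/'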
My disks are mounted with the relatime option, so I'm not expecting a disk write on every read of those images. The fastcgi_buffers setting (8 × 16k) can hold responses up to 128K, which should be large enough for my pages, and there are no log warnings about temporary files being used.
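If large responses were somehow being buffered to temp files anyway, one way to rule that out (purely a hypothetical test, not something in my current config) would be to forbid FastCGI temp files entirely and watch whether the write rate drops:
# test only: never spill FastCGI responses to temporary files;
# anything that doesn't fit in fastcgi_buffers is then passed on synchronously
fastcgi_max_temp_file_size 0;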
I do have a FastCGI cache setup:
fastcgi_cache_path /shm/nginx/fastcgi levels=1:2 keys_zone=microcache:10m max_size=10m inactive=1m;
fastcgi_cache_key $scheme$request_method$host$request_uri;
fastcgi_cache microcache;
fastcgi_cache_valid 200 5s;
fastcgi_cache_lock on;
fastcgi_cache_lock_timeout 5s;
fastcgi_cache_lock_age 5s;
Could all this writing be going to that cache? It seems like quite a lot.
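All of this also assumes /shm/nginx is actually tmpfs-backed inside the container, so that cache writes never touch the disk; something like this should confirm which filesystem really backs that path:
# show the filesystem (and type) backing the cache directory
findmnt -T /shm/nginx
df -hT /shm/nginx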
What am I missing? Is this a normal amount of I/O (presumably writes to sockets) for an active webserver?