I have a high-traffic webserver (up to ~50 page loads per second) running nginx, which passes HTML requests to a PHP-FPM backend and serves static assets directly.
Under this load, I see a surprisingly high amount of disk I/O from the nginx worker processes. My pages include a lot of static images that are replaced every few hours, so heavy disk reads are expected, but I actually see more writing than reading, with a sustained write rate on the order of 10 MB/s from the nginx processes (as reported by iotop).
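(For reference, the per-process rate comes from watching something like batch-mode iotop; the exact invocation is incidental:)
# only show processes currently doing I/O, one line per process, 5-second samples
iotop -o -P -d 5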
With my Docker containers up for 12 days, this is the I/O history for one of the nginx processes:
$ cat /proc/nginx-pid/io
rchar: 34537119315778
wchar: 27295224195419
syscr: 2217567839
syscw: 2285538495
read_bytes: 1499124252672
write_bytes: 7338930909184
cancelled_write_bytes: 141058945024
That's 1.5TB read over 12 days, which makes sense given my static assets and traffic, but 7.3TB written to disk by nginx seems insane to me.
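For scale, 7,338,930,909,184 bytes over 12 days works out to roughly 7 MB/s of sustained writes from that one process, so it is the same order of magnitude as the rate iotop reports.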
Here's the output of time strace -p $nginx_pid -c over a period of 60 seconds:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.05    0.840451           5    178693      2446 write
 16.52    0.420117           4    104554           pread64
 10.09    0.256603           6     39772         1 readv
  9.24    0.234970          11     20994           pwritev
  6.60    0.167708          16     10700       188 open
  4.92    0.125204           3     37249           epoll_wait
  4.47    0.113708           3     41601     20572 read
  3.70    0.094091          23      4154      4154 connect
  3.16    0.080408           5     15515           close
  1.77    0.045090          11      4260           writev
  1.42    0.036188           9      4139           rename
  0.92    0.023423           6      4139           chmod
  0.68    0.017227           2     10515           fstat
  0.57    0.014555           3      5131           epoll_ctl
  0.53    0.013573           3      4154           socket
  0.53    0.013569           3      4903           recvfrom
  0.35    0.008942           2      5892           getpid
  0.31    0.007828           2      4156           getsockopt
  0.28    0.007092           2      3725           getsockname
  0.28    0.007003           2      4154           ioctl
  0.24    0.006143           5      1258         1 stat
  0.19    0.004889           4      1324       670 accept4
  0.12    0.003023           4       690           pwrite64
  0.04    0.001125           2       631           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    2.542930                512303     28032 total
2.55user 7.57system 1:00.56elapsed 16%CPU (0avgtext+0avgdata 840maxresident)k
0inputs+0outputs (0major+273minor)pagefaults 0swaps
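What I can't tell from this summary is where those write/pwritev calls actually go. I assume re-attaching with strace's file-descriptor decoding would show whether they target sockets or regular files; something along these lines (the -y flag and the syscall list are just my guess at a useful invocation, $nginx_pid as above):
# decode fds for write-type syscalls to see whether they hit sockets or files
strace -f -y -e trace=write,writev,pwritev,pwrite64 -p $nginx_pid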
Some relevant nginx config:
worker_processes 8;
worker_rlimit_nofile 20000;
worker_connections 5000;
multi_accept on;
use epoll;
access_log off;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
open_file_cache max=10000 inactive=60s;
open_file_cache_valid 5s;
open_file_cache_min_uses 6;
fastcgi_buffers 8 16k;
I don't see why the nginx process should be writing this much, especially with the access log disabled. Running lsof -p nginx-pid shows that most files opened by the process are sockets and the images being served. Only the sockets are open for writing, so I guess all that writing must be going through those?
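(Concretely, a filter like this is what I mean by the sockets being open for writing; the awk expression keeps only descriptors whose FD column shows w or u access, with $nginx_pid standing in for the worker PID:)
# list only file descriptors opened for writing or read/write
lsof -p $nginx_pid | awk '$4 ~ /^[0-9]+[wu]/'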
My disks are mounted with the relatime option, so I'm not expecting a disk write on every read of those images. The fastcgi_buffers setting (8 × 16k) can hold responses up to 128K, which should be large enough for my pages, and there are no log warnings about temporary files being used.
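If large responses were somehow being buffered to temp files anyway, one way to rule that out (purely a hypothetical test, not something in my current config) would be to forbid FastCGI temp files entirely and watch whether the write rate drops:
# test only: never spill FastCGI responses to temporary files;
# anything that doesn't fit in fastcgi_buffers is then passed on synchronously
fastcgi_max_temp_file_size 0;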
I do have a FastCGI cache setup:
fastcgi_cache_path /shm/nginx/fastcgi levels=1:2 keys_zone=microcache:10m max_size=10m inactive=1m;
fastcgi_cache_key $scheme$request_method$host$request_uri;
fastcgi_cache microcache;
fastcgi_cache_valid 200 5s;
fastcgi_cache_lock on;
fastcgi_cache_lock_timeout 5s;
fastcgi_cache_lock_age 5s;
Could all this writing be going to that cache? It seems like quite a lot.
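All of this also assumes /shm/nginx is actually tmpfs-backed inside the container, so that cache writes never touch the disk; something like this should confirm which filesystem really backs that path:
# show the filesystem (and type) backing the cache directory
findmnt -T /shm/nginx
df -hT /shm/nginx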
What am I missing? Is this a normal amount of I/O (presumably writes to sockets) for an active webserver?