I have just set up an Ubuntu 12.04.2 LTS server that serves a large number of fairly large static files. The configuration is the same as on another machine that works great. The other machine runs Ubuntu 11.10 with nginx 1.0.5. The machine with the problem runs nginx 1.1.19 and can hardly push around 20 MB/s (even though it is on a dedicated 1 Gbit line), with iotop showing high disk IO by nginx. This is from iotop:
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
4569 be/4 www-data 754.61 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
4571 be/4 www-data 1257.69 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
4574 be/4 www-data 2.46 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3951 be/4 www-data 1760.77 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3950 be/4 www-data 503.08 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4573 be/4 www-data 2012.31 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3952 be/4 www-data 1006.15 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3954 be/4 www-data 1760.77 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4572 be/4 www-data 4.05 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3956 be/4 www-data 2.70 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3953 be/4 www-data 251.54 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4567 be/4 www-data 2.21 M/s 0.00 B/s 0.00 % 98.30 % nginx: worker process
4570 be/4 www-data 754.61 K/s 0.00 B/s 0.00 % 97.91 % nginx: worker process
3949 be/4 www-data 1006.15 K/s 0.00 B/s 0.00 % 88.21 % nginx: worker process is shutting down
3955 be/4 www-data 1509.23 K/s 0.00 B/s 0.00 % 84.60 % nginx: worker process is shutting down
So for some reason the processes that are trying to shut down cause the IO, and the server goes into an almost non-responsive state, with the load growing as high as 5-6 (this is a dual-core machine). Meanwhile, CPU utilisation is around 0.5%.
After restarting nginx everything is fine for some time and then this happens again.
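To put the iotop numbers above in perspective, here is a small Python sketch that sums the DISK READ column, split between active workers and workers that are shutting down. The rows are copied from the output above. The helper name read_rates_kib and the assumption that iotop's K and M suffixes are 1024-based are mine, so treat the result as a rough figure only.

```python
import re

# Rows copied verbatim from the iotop output in this post.
IOTOP_ROWS = """\
4569 be/4 www-data 754.61 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
4571 be/4 www-data 1257.69 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
4574 be/4 www-data 2.46 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3951 be/4 www-data 1760.77 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3950 be/4 www-data 503.08 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4573 be/4 www-data 2012.31 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3952 be/4 www-data 1006.15 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3954 be/4 www-data 1760.77 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4572 be/4 www-data 4.05 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process
3956 be/4 www-data 2.70 M/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
3953 be/4 www-data 251.54 K/s 0.00 B/s 0.00 % 99.99 % nginx: worker process is shutting down
4567 be/4 www-data 2.21 M/s 0.00 B/s 0.00 % 98.30 % nginx: worker process
4570 be/4 www-data 754.61 K/s 0.00 B/s 0.00 % 97.91 % nginx: worker process
3949 be/4 www-data 1006.15 K/s 0.00 B/s 0.00 % 88.21 % nginx: worker process is shutting down
3955 be/4 www-data 1509.23 K/s 0.00 B/s 0.00 % 84.60 % nginx: worker process is shutting down
"""

# Assumption: the unit suffixes are 1024-based (K = KiB, M = MiB).
UNIT_TO_KIB = {"B": 1 / 1024, "K": 1.0, "M": 1024.0, "G": 1024.0 * 1024}

# TID, PRIO, USER, then the DISK READ value and unit, then the command text.
ROW_RE = re.compile(r"^\d+\s+\S+\s+\S+\s+([\d.]+)\s+([BKMG])/s\s+.*?nginx: (.*)$")


def read_rates_kib(text):
    """Sum DISK READ in KiB/s, grouped by worker state."""
    rates = {"active": 0.0, "shutting down": 0.0}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if not match:
            continue  # skip headers or anything that is not a worker row
        value, unit, command = match.groups()
        state = "shutting down" if "shutting down" in command else "active"
        rates[state] += float(value) * UNIT_TO_KIB[unit]
    return rates


rates = read_rates_kib(IOTOP_ROWS)
total_mib = sum(rates.values()) / 1024
for state, kib in rates.items():
    print(f"{state}: {kib / 1024:.2f} MiB/s")
print(f"total: {total_mib:.2f} MiB/s")
```

Under that unit assumption the combined read rate comes out in the same range as the roughly 20 MB/s the server manages to push over the network.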
This is the latest from the error log of nginx:
2013/03/18 13:09:28 [alert] 3676#0: open socket #297 left in connection 145
and then this happens:
2013/03/18 13:10:11 [alert] 3749#0: 100 worker_connections are not enough
and this is the nginx.conf:
user www-data;
worker_processes 8;
worker_rlimit_nofile 20480;
pid /var/run/nginx.pid;
events {
worker_connections 100;
# multi_accept on;
}
http {
##
# Basic Settings
##
sendfile off;
output_buffers 1 512k;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 5;
types_hash_max_size 2048;
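For reference, a rough sketch of the connection budget this config implies, written in Python purely as arithmetic. The numbers come from the nginx.conf above. The function name and the estimate of two file descriptors per static download (one client socket plus one open file) are my own assumptions, not something taken from the nginx documentation.

```python
# Values taken from the nginx.conf above.
worker_processes = 8
worker_connections = 100
worker_rlimit_nofile = 20480


def max_connections(processes, connections_per_worker):
    """Upper bound on simultaneous connections across all workers."""
    return processes * connections_per_worker


total = max_connections(worker_processes, worker_connections)

# Assumption: a static download needs about two descriptors per connection
# (the client socket and the file being read).
fds_per_worker_estimate = 2 * worker_connections

print(f"max simultaneous connections: {total}")
print(f"estimated fds per worker: {fds_per_worker_estimate} "
      f"(limit {worker_rlimit_nofile})")
```

With these settings each worker stops accepting at 100 connections, which lines up with the "100 worker_connections are not enough" alert, while the file descriptor limit is nowhere near being reached.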
Any help will be highly appreciated!
EDIT:
Turning sendfile on or off makes no difference.
Setting worker_rlimit_nofile == worker_connections makes no difference.
Changing worker_processes makes no difference either.
smartctl shows no problems with the disk; however, I also tried the second disk on this machine and there was still no difference.