- Context:
- The server is a CentOS 5.2 x86_64 virtual machine with vmxnet3 interfaces, running on vSphere 4.1 on a Nehalem-based host (at about half its CPU and memory capacity according to vCenter) with a 10 Gb network. iostat shows almost no I/O on the VM's virtual SCSI disk.
- Reads videos from an Isilon cluster, using NFS (atime is disabled)
- Serves them using lighttpd 1.5.0, which sits at 20% cpu. Around 650 HTTP connections, including 550 established, with an average of 100 Kb in Send-Q.
As we load the server with more requests, CPU iowait and IRQ time increase. Memory isn't the problem.
Cpu0 : 0.0%us, 3.0%sy, 0.0%ni, 18.0%id, 0.0%wa, 32.0%hi, 47.0%si, 0.0%st
Cpu1 : 3.0%us, 4.0%sy, 0.0%ni, 55.4%id, 34.7%wa, 0.0%hi, 3.0%si, 0.0%st
/proc/interrupts shows 4163 irq/s on the interface used for HTTP and 2269 irq/s on the one used for NFS, for 180 Mbps and 130 Mbps respectively according to iptraf.
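For reference, here is a minimal sketch of how per-interface irq/s figures like these can be derived from two samples of /proc/interrupts. It assumes the usual layout (a header row of CPU columns, then one row per IRQ with per-CPU counters followed by a description). The function names and the 5-second default interval are illustrative choices, not an existing tool.

```python
import time


def parse_interrupts(text):
    """Return {"<irq> <description>": total count across CPUs}."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Leading numeric fields are per-CPU counters; the rest is the description.
        n = 0
        while n < min(ncpu, len(fields)) and fields[n].isdigit():
            n += 1
        key = (label.strip() + " " + " ".join(fields[n:])).strip()
        totals[key] = sum(int(f) for f in fields[:n])
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in the second sample."""
    return {k: (after[k] - before.get(k, 0)) / seconds for k in after}


def sample_rates(path="/proc/interrupts", interval=5.0):
    """Read the file twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return irq_rates(before, after, interval)
```

Calling `sample_rates()` on the VM and filtering the keys by interface name gives one rate per NIC, which can be compared against the iptraf throughput for the same interface.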
iostat for the NFS mount:
rBlk_nor/s  wBlk_nor/s  rBlk_dir/s  wBlk_dir/s  rBlk_svr/s  wBlk_svr/s   rops/s   wops/s
  63737.87        0.00        0.00        0.00    61364.71        0.00  1098.04  1107.84
Note the non-zero wops/s, even though /proc/self/mountstats shows no SETATTR, WRITE or other modifying operations on this read-only mount:
opts: ro,vers=3,rsize=32768,wsize=32768,acregmin=1200,acregmax=1200,acdirmin=1200,acdirmax=1200,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys
age: 2405948
caps: caps=0x1,wtmult=8192,dtsize=4096,bsize=0,namelen=255
sec: flavor=1,pseudoflavor=1
events: 3496282 32148506 1 1697 3176945 2598729 37924190 0 33339443 67286271 0 0 20 0 0 0 0 3176406 0 0 0 0 0 0 0
bytes: 31773968205376 0 0 0 31969360034250 0 7805430344 0
RPC iostats version: 1.0 p/v: 100003/3 (nfs)
xprt: tcp 779 0 50 250 0 1014646219 1014646203 0 8377876491 11916594888
per-op statistics
NULL: 0 0 0 0 0 0 0 0
GETATTR: 3496282 3496282 0 461510280 391583584 2165765 2594488 5332330
SETATTR: 0 0 0 0 0 0 0 0
LOOKUP: 2598882 2598882 0 374792176 623714816 3558569 79355750 83640121
ACCESS: 2824036 2824036 0 384066672 338884320 1788232 2276978 4482334
READLINK: 0 0 0 0 0 0 0 0
READ: 1005726981 1005726982 0 144824685416 32098094238420 7454826308 4671373832 13644100410
WRITE: 0 0 0 0 0 0 0 0
CREATE: 0 0 0 0 0 0 0 0
MKDIR: 0 0 0 0 0 0 0 0
SYMLINK: 0 0 0 0 0 0 0 0
MKNOD: 0 0 0 0 0 0 0 0
REMOVE: 0 0 0 0 0 0 0 0
RMDIR: 0 0 0 0 0 0 0 0
RENAME: 0 0 0 0 0 0 0 0
LINK: 0 0 0 0 0 0 0 0
READDIR: 0 0 0 0 0 0 0 0
READDIRPLUS: 13 13 0 2132 23788 60 1240 1300
FSSTAT: 2 2 0 256 336 0 0 0
FSINFO: 1 1 0 128 164 0 10 10
PATHCONF: 0 0 0 0 0 0 0 0
COMMIT: 0 0 0 0 0 0 0 0
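The cumulative per-op counters above can be turned into per-request averages, which makes the NFS side easier to judge. The sketch below assumes the eight columns of the version 1.0 per-op format are, in order: operations, transmissions, major timeouts, bytes sent, bytes received, cumulative queue time, cumulative RTT and cumulative total execute time, with the three times in milliseconds. The field names in `FIELDS` are my own labels.

```python
# Assumed column order for "RPC iostats version: 1.0" per-op lines.
FIELDS = ("ops", "trans", "timeouts", "bytes_sent", "bytes_recv",
          "queue_ms", "rtt_ms", "execute_ms")


def parse_per_op(text):
    """Parse the lines following 'per-op statistics' into {op: {field: int}}."""
    stats = {}
    in_per_op = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("per-op statistics"):
            in_per_op = True  # ignore 'bytes:', 'events:' etc. above this marker
            continue
        if not in_per_op:
            continue
        name, sep, rest = line.partition(":")
        values = rest.split()
        if sep and len(values) == len(FIELDS) and all(v.isdigit() for v in values):
            stats[name] = dict(zip(FIELDS, map(int, values)))
    return stats


def per_op_averages(op):
    """Average times (ms) and reply size per operation; None if never issued."""
    n = op["ops"]
    if n == 0:
        return None
    return {
        "queue_ms": op["queue_ms"] / n,      # waiting for a transport slot
        "rtt_ms": op["rtt_ms"] / n,          # request sent until reply received
        "execute_ms": op["execute_ms"] / n,  # total time seen by the client
        "bytes_recv": op["bytes_recv"] / n,
        "retransmits": op["trans"] - op["ops"],
    }
```

Under those assumptions, the READ line above works out to roughly 7.4 ms queued, 4.6 ms RTT and 13.6 ms total per request since the mount was made. These are lifetime averages, so comparing two snapshots taken under load would be more telling.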
- How can I tell whether the HTTP side or the NFS side is responsible for the iowait and IRQ CPU usage? And how can I tell whether the vSphere host is reaching its I/O limits?