Copying Files Gives Disk Lag

Question

I run several game server services on a Debian 6.0 64-bit machine with a custom kernel.

My running services lag whenever I install a new service.

Installing a new service just involves extracting the files to a user's directory, and I do it at the lowest priority (nice 19 and a 1000 kbps SCP limit).

This time I captured "vmstat 3" output while installing a new service, covering the moment the lag occurred. I'm not good at reading it, but I guess it points to some disk issue.

http://paste.ubuntu.com/1152249/

See line numbers 11-16 for the time when it happened.

UPDATE:

Here is the df -h output: http://paste.ubuntu.com/1152734/

The 250G disk is the one used by these operations and services.

What scares me are the 4k context switches BEFORE the file transfer occurs; it also looks like you have 2 CPUs/cores and one is 100% used when you aren't doing anything. Could you provide the output of top (type c and P), ps aux, and cat /proc/cpuinfo? I'd look for a problem while you aren't transferring files. — , Aug 17 '12 at 14:56
Install [Nmon](http://nmon.sourceforge.net/pmwiki.php). Have it displaying (or logging) CPU (c), Disk (d), Top (t) while you do the copy. Are your disks going up to or anywhere near 100% utilisation in normal operation without the copy? With the copy? — Matt, Aug 17 '12 at 15:03
I wonder if this has something to do with your custom kernel, because that many context switches per second is absolutely not normal during a simple file copy. Did you use the Debian kernel config with a more recent kernel, or did you fiddle around with kernel settings? Changed HZ values? Enabled some experimental features? — Janne Pikkarainen, Aug 17 '12 at 15:05
Okay yes I have 2 cores. The overall CPU usage is around 35-45% normally. Here are the outputs: top ( c ) http://paste.ubuntu.com/1152898/ - top ( P ) http://paste.ubuntu.com/1152878 - ps aux http://paste.ubuntu.com/1152901/ - cat /proc/cpuinfo http://paste.ubuntu.com/1152891/ I will also try to install Nmon and check if it goes near 100%. About Kernel, I use a 3.2.6 with RT patch, nothing experimental but something I've always used on other systems and nothing like this happened. — Asad Moeen, Aug 17 '12 at 16:23

score 3 · Accepted Answer · answered Aug 17 '12 at 14:53

I'd suggest changing the I/O scheduler on the system from its default and retesting the file copy. Depending on your kernel version, you may actually have the deadline scheduler set as the default. Maybe this is a case where the cfq scheduler would make more sense.

Check your current setting with:

cat /sys/block/<device>/queue/scheduler

Where <device> is the data drive's block device (e.g. sda, sdb, etc.).
Depending on what you have set currently, you can change it (as root) with:

echo deadline > /sys/block/<device>/queue/scheduler

or

echo cfq > /sys/block/<device>/queue/scheduler

Test your file copy...

You can also do this globally by appending elevator=deadline or elevator=cfq to the GRUB kernel boot line and rebooting.
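
As a rough sketch of the check above, the script below prints the active scheduler for every block device that exposes one. The function name active_scheduler is my own, and it assumes the usual sysfs format where the selected scheduler is shown in square brackets (for example "noop deadline [cfq]"). Devices whose scheduler line has no bracketed entry simply print an empty value.

```shell
#!/bin/sh
# Print the scheduler currently selected in a sysfs scheduler line.
# Assumption: the active entry is wrapped in square brackets,
# e.g. "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Walk every block device that exposes a readable scheduler file.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}   # strip the leading path
    dev=${dev%%/*}         # keep only the device name
    printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
done
```

Running it before and after the echo commands should show whether the change took effect on the data drive.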

score 2 · Answer 2 · answered Aug 17 '12 at 15:49

There are a few things you can do. The IO Scheduler is the lowest hanging fruit overall and does not require any major reconfiguration.

Another thing to consider is increasing the block size of your file system. The default is 4K for the ext series, which is suitable for most cases. However, it is important to know a little more about your underlying storage. For instance, if your disks are in a RAID array you may find it beneficial to have your block size equal the stripe size. If you are using a standard disk, check whether it uses 4KB sectors (see /sys/block/<device>/queue/hw_sector_size). Newer disks will use 4K sectors. If you are using 4K sectors, you may wish to do some research to ensure your partitions are sector-aligned correctly for maximum performance. Most Linux distros account for this today, however. Increasing the block size of a file system can allow larger chunks of data to be grouped together on the disk, resulting in fewer seek operations. However, larger block sizes can use up disk space faster when you have lots of small files that each need only a few blocks.
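
A minimal sketch of the sector size and alignment check, assuming the usual sysfs layout where each partition directory under /sys/block/<device>/ has a start file giving the partition's first sector in 512-byte units. The helper name is_4k_aligned is my own. It may also be worth reading queue/physical_block_size, since some drives report 512-byte logical sectors on top of 4K physical ones.

```shell
#!/bin/sh
# Succeed when a partition start, given in 512-byte sectors (the unit
# sysfs uses for "start"), falls on a 4096-byte boundary.
is_4k_aligned() {
    [ $(( $1 * 512 % 4096 )) -eq 0 ]
}

for dev in /sys/block/*; do
    [ -r "$dev/queue/hw_sector_size" ] || continue
    name=${dev##*/}
    printf '%s: hw_sector_size=%s\n' "$name" "$(cat "$dev/queue/hw_sector_size")"
    # Partitions show up as subdirectories containing a "start" file.
    for part in "$dev"/*/start; do
        [ -r "$part" ] || continue
        p=${part%/start}
        p=${p##*/}
        start=$(cat "$part")
        if is_4k_aligned "$start"; then
            printf '  %s: start sector %s, 4K-aligned\n' "$p" "$start"
        else
            printf '  %s: start sector %s, NOT 4K-aligned\n' "$p" "$start"
        fi
    done
done
```

A partition starting at sector 2048 passes the check, while the old DOS-style start at sector 63 does not.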

The ext filesystem also has some options regarding IO write/read barriers; it may be beneficial to have a look at some of these as well. In addition, if you have multiple disks, you may wish to consider putting the journal on another disk. This reduces the IO load on the data disk, resulting in greater throughput.

score 0 · Answer 3 · answered Aug 17 '12 at 15:30

Use ionice. Idle priority will help you.

ionice -c 3 <command>

Stone


Note: `ionice` only works with the [CFQ scheduler](http://en.wikipedia.org/wiki/CFQ). – ewwhite Aug 17 '12 at 15:34
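
For the extraction step from the question, the two priorities can be combined. This is only a sketch: the wrapper name low_prio and the archive and directory paths in the usage comment are hypothetical, and the idle class only has an effect when the active scheduler honours it, as noted above.

```shell
#!/bin/sh
# Run a command at the lowest CPU priority and, where possible, in the
# idle I/O class. Falls back to plain nice if ionice is missing or the
# idle class cannot be set on this system.
low_prio() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Hypothetical usage: extract a new service into a user's directory.
# low_prio tar -xzf /tmp/service.tar.gz -C /home/gameuser/
```

The fallback keeps the script usable on machines without ionice, at the cost of silently losing the I/O priority part.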
