
We have written an application that performs small (22kB) writes to multiple files at once (one thread performing asynchronous queued writes to multiple locations on behalf of other threads) on the same local volume (RAID1).
99.9% of the writes are low-latency, but occasionally (maybe every minute or two) we see one or two writes with huge latency (I have seen 10 seconds and above) with no obvious explanation.
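(Not part of the original setup, but for anyone wanting to measure this outside our application: a minimal, cross-platform sketch in Python that times individual 22 kB writes and flags outliers. The file name, write count, and 1-second threshold are arbitrary choices for illustration; `os.fsync` is used to push each write through the OS cache so the timing reflects the disk.)

```python
import os
import time

CHUNK = 22 * 1024       # 22 kB write size, matching the application
THRESHOLD_S = 1.0       # arbitrary threshold for flagging a "huge latency" write


def timed_writes(path, count=100):
    """Perform `count` sequential 22 kB writes and return per-write latencies."""
    buf = b"\x00" * CHUNK
    latencies = []
    with open(path, "wb", buffering=0) as f:
        for _ in range(count):
            t0 = time.perf_counter()
            f.write(buf)
            os.fsync(f.fileno())  # force the write to disk, not just the cache
            latencies.append(time.perf_counter() - t0)
    return latencies


if __name__ == "__main__":
    lat = timed_writes("latency_probe.bin")
    outliers = [l for l in lat if l > THRESHOLD_S]
    print(f"max latency: {max(lat) * 1000:.1f} ms, outliers: {len(outliers)}")
    os.remove("latency_probe.bin")
```

On a healthy volume the maximum should stay in the low milliseconds; a multi-second maximum reproduces the symptom described above.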

Platform: Win2003 Server with NTFS.
Monitoring: Sysinternals Process Monitor (see link below) and our own application logging.

We have tried a number of remedies gleaned from various websites, e.g.:

  • Making the first part of file names unique to aid 8.3 name generation

  • Writing files to multiple directories

  • Changing Intel Disk Write Caching

  • Windows File/Printer Sharing

    • Minimize memory used

    • Balance

    • Maximize data throughput for file sharing

    • Maximize data throughput for network applications

  • System->Advanced->Performance->Advanced

  • NtfsDisableLastAccessUpdate - use "fsutil behavior set disablelastaccess 1"

  • Disable 8.3 name generation - use "fsutil behavior set disable8dot3 1" + restart

  • Enable a large size file system cache

  • Disable paging of the kernel code

  • IO Page Lock Limit

  • Turn Off (or On) the Indexing Service

But nothing seems to make much difference. There is a whole host of things we haven't tried yet, but we wondered if anyone had come across the same problem, a reason, and a solution (programmatic or otherwise)?

We can reproduce the problem using IOMeter and a simple setup:

  1. Start IOMeter and remove all but the first worker thread in 'Topology' using the disconnect button.

  2. Select the Worker thread and put a cross in the box next to the disk you want to use in the Disk Targets tab and put '2000000' in Maximum Disk Size (NOTE: must have at least 1GB free space; sector size is 512 bytes)

  3. Next create a new access specification and add it to the worker thread:

    • Transfer Request Size = 22kB

    • 100% Sequential

    • Percent of Access Spec = 100%

    • Percent Read/Write = 100% Write

  4. Change Results Display Update Frequency to 5 seconds, Test Setup Run Time to 20 seconds and both 'Number of Workers to Spawn Automatically' settings to zero.

  5. Select the Worker Thread in the Topology panel and hit the Duplicate Worker button 59 times to create 60 threads with identical settings.

Hit the 'Go' button (green flag) and monitor the Results tab. The 'Maximum I/O Response Time (ms)' always hits at least 3500 on our machine. Our machine isn't exactly slow (Xeon 8 core rack server with 4GB and onboard RAID).
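(Again, not from the original test: the IOMeter workload above can be roughly approximated in Python, for readers without IOMeter to hand. This is an illustrative sketch only, not an IOMeter equivalent: the worker count is reduced from 60 for brevity, each worker writes to its own file, and `os.fsync` stands in for IOMeter's unbuffered I/O.)

```python
import os
import threading
import time

CHUNK = b"\x00" * (22 * 1024)  # 22 kB transfer size, matching the access spec
WORKERS = 8                    # the IOMeter test used 60; reduced for brevity
WRITES_PER_WORKER = 50

max_latency = 0.0
lock = threading.Lock()


def worker(idx):
    """Sequentially write CHUNK-sized blocks, tracking the worst latency seen."""
    global max_latency
    path = f"probe_{idx}.bin"
    with open(path, "wb", buffering=0) as f:
        for _ in range(WRITES_PER_WORKER):
            t0 = time.perf_counter()
            f.write(CHUNK)
            os.fsync(f.fileno())  # push the write through the OS cache
            dt = time.perf_counter() - t0
            with lock:
                max_latency = max(max_latency, dt)
    os.remove(path)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Maximum I/O response time: {max_latency * 1000:.1f} ms")
```

If the reported maximum jumps into the seconds rather than milliseconds under this concurrent load, that matches the behaviour we see in IOMeter.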

I'd be interested to see what other people get. We suspect it might be something to do with the NTFS filesystem (ours is currently 75% full of fragmented files) and we are going to try a few things around this principle. But it is also related to disk performance, since we don't see it on a RAMDisk and it is not as severe on a RAID10 array.

Any help is much appreciated.
Richard

ProcMon Result


1 Answer


I now have more information on this issue.

Having run the described IOMeter test on 12 different machines with a variety of hardware, I have narrowed it down to a specific RAID chipset: three different machines with the same chipset exhibit this behaviour using both RAID1 and RAID10, while every other machine produces a result at least an order of magnitude better.

Chipset: Intel 631xESB/632xESB SATA RAID (aka ESB2)

See this post on the Intel site for more information and hopefully a response from Intel:
Intel 631xESB/632xESB SATA RAID (aka ESB2) writes slow

Richard
