There are so many threads on whether or not you should mess with the page file. This one describes a specific, real-world situation in my production environment. The conclusion I've come to, in order to fix my problem, is to disable the page file.
I'm running a series of guest VMs, all of which run Server 2003 Enterprise Edition (inorite?). My physical hosts are HP DL380 G7s running VMware ESXi 5.0 (managed via vCenter). For storage I have an HP P2000 G3 SAS array loaded with sixteen 300 GB 10k SAS drives in RAID 6; call it LUN01. These virtual servers make up our Wonderware environment: a single SQL server and Historian, two application servers, and two terminal servers.
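For a rough sense of scale, here is a back-of-the-envelope sketch of what LUN01 works out to. It assumes all 16 drives sit in a single RAID 6 group with no hot spares, which is an assumption and not something stated above; the function name is just for illustration.

```python
# Rough RAID 6 capacity for LUN01.
# Assumption: one RAID 6 group spanning all 16 drives, no hot spares.
DRIVE_COUNT = 16
DRIVE_SIZE_GB = 300
RAID6_PARITY_DRIVES = 2  # RAID 6 reserves two drives' worth of parity


def raid6_usable_gb(drive_count, drive_size_gb):
    """Usable capacity of a single RAID 6 group, in GB."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - RAID6_PARITY_DRIVES) * drive_size_gb


raw_gb = DRIVE_COUNT * DRIVE_SIZE_GB
usable_gb = raid6_usable_gb(DRIVE_COUNT, DRIVE_SIZE_GB)
print(f"raw: {raw_gb} GB, usable: {usable_gb} GB")
```

Every one of those drives takes part in a rebuild, and every VM lives on the same spindles, which is why a reconstruction is felt by the whole stack.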
The work this stack performs is mission critical and determines whether the facility can serve its function or not (i.e., when a server goes down, the business goes down). Recently, several disk failures in the P2000 array caused me to rethink the architecture from the ground up. Reconstructing disks in the array hurt performance so badly that the Wonderware app became completely unresponsive. That isn't surprising in itself, since these VMs all run I/O-intensive applications and RAID reconstruction places a high demand on the array.
I've determined that the bottleneck during disk reconstruction comes from application server disk writes, seemingly because the app server is using the system page file instead of RAM. The amount of network I/O thus becomes directly linked to disk I/O, so a severe performance hit on the disks during reconstruction directly impacts app server I/O. It makes very little sense to me why it's designed this way, but it perfectly explains why a server that stores nothing locally (an app server) would sustain a 10 Mbps disk write rate (per the VMware performance statistics for the app server VM).
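To put that write rate in perspective, here is a quick conversion sketch. It assumes the 10 Mbps figure means megabits per second and is sustained around the clock; both are assumptions, and if the counter was really in megabytes per second the results are 8x larger.

```python
# Convert the sustained write rate into MB/s and a daily volume.
# Assumption: "Mbps" means megabits per second, sustained 24 hours a day.
WRITE_RATE_MBPS = 10
SECONDS_PER_DAY = 86400


def mbps_to_mb_per_s(mbps):
    """Megabits per second -> megabytes per second (8 bits per byte)."""
    return mbps / 8


def gb_written_per_day(mbps):
    """Daily write volume in decimal GB at a constant rate."""
    return mbps_to_mb_per_s(mbps) * SECONDS_PER_DAY / 1000


mb_per_s = mbps_to_mb_per_s(WRITE_RATE_MBPS)
gb_per_day = gb_written_per_day(WRITE_RATE_MBPS)
print(f"{mb_per_s:.2f} MB/s, about {gb_per_day:.0f} GB/day")
```

That is a lot of writing for a box that supposedly stores nothing locally, and all of it lands on the same array that is trying to rebuild.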
So... given the circumstances, what I'm thinking is to disable the page file in the guest OS (Server 2003 EE) to stop the deployed Wonderware app engine from creating such high disk I/O demand, and as a result lessen the impact of future disk reconstructions in the RAID.
- What do you think?
- Does this justify disabling the page file?
- Am I overlooking another solution to minimize the performance impact of RAID reconstruction?