-5

Considering that Windows makes heavy use of the pagefile even with huge amounts of RAM available, is it not best to have this pagefile on the fastest disk possible as close to the virtual systems as possible?

I'm thinking, RAM disk.

Where I work, storage for VMs is out on a NAS/SAN. I'm worried that so much memory access is having to go across the network!

As a side note, I think it's about time MS got rid of paging and told us to buy more DIMMs.

UPDATE

Accessing a local spindle is roughly 40,000 times slower than a DIMM (a disk seek is on the order of 4 ms versus ~100 ns for a RAM access), so going over the network will be even slower for hard faults. I don't know why I got the downvote; I'm certain that this is an issue unless there's some other mechanism in ESX/Hyper-V that manages this.

FINAL UPDATE

Some useful links and advice I found related to this are below.

From Netapp's forums:

Host the virtual machine (VM) vswap and temp/page file on a separate NFS data store on a different volume on the NetApp storage system. (Separation of transient data allows faster completion of NetApp Snapshot copies and achieves higher storage efficiency.)

From thebitsthatbyte.com:

Hyper-V VM Performance Tip: Move the Page File (pagefile.sys) to Another Virtual Disk

From VMinstall.com:

Set up a data store in each ESXi cluster for guest SWAP files, see diagram to the right. (Do not use local ESX disks! This will save more expensive SAN storage but causes latency for DRS and vMotioning of VMs.)

From troubleshooting ESX 4 advice, at pubs.vmware.com:

For resource-intensive virtual machines, separate the virtual machine's physical disk drive from the drive with the system page file. This alleviates disk spindle contention during periods of high use.
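
Regarding the tip above about moving pagefile.sys to a separate virtual disk: as far as I can tell, the mechanism behind that advice is the "PagingFiles" registry value, which holds one "path initial-MB maximum-MB" entry per pagefile. Below is a minimal sketch of reading and repointing it from Python; the drive letter and sizes are illustrative assumptions, automatic pagefile management must be off for a manual entry to stick, and it needs to run elevated.

```python
# Hedged sketch: repoint the Windows pagefile at a dedicated virtual disk by
# editing the "PagingFiles" REG_MULTI_SZ value. Entries have the form
# "<path> <initial size MB> <maximum size MB>". The drive letter and sizes
# below are assumptions for illustration; a reboot is needed to apply.
import winreg

KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY, 0,
                    winreg.KEY_READ | winreg.KEY_SET_VALUE) as key:
    current, _ = winreg.QueryValueEx(key, "PagingFiles")
    print("Current pagefile entries:", current)

    # Example: a fixed 4-8 GB pagefile on an assumed D: virtual disk.
    winreg.SetValueEx(key, "PagingFiles", 0, winreg.REG_MULTI_SZ,
                      [r"d:\pagefile.sys 4096 8192"])
```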

Luke Puplett
    I didn't downvote... yet... but can tell you that you got downvoted because this is a bad "question" for our Q&A, objective-based format, and is predicated on incorrect and completely confused assertions. It strikes me as a rambling rant/not a real question/trainwreck-of-thought, to be completely blunt about it. I'm actually surprised it doesn't have a bunch of votes to close already. But, seeing as how I'm procrastinating about doing real work right now anyway, I threw down an answer of sorts to where I think you're getting the fundamentals wrong to arrive at this... "question." – HopelessN00b Sep 13 '12 at 11:35
    I don't normally explain my downvotes but as you clearly don't understand how we do things around here I'll make an exception. You get a -1 from me for your ill considered "logic" and lack of understanding. If I could I'd downvote again for your argumentative attitude and comments. – John Gardeniers Sep 13 '12 at 12:22
  • John, please explain your downvotes in future. You wouldn't smack your kid's wrist and walk off; you'd praise the good and, when they're bad, explain how and why. I had to update the question before someone took the time to explain why. That's not good for this forum. – Luke Puplett Sep 13 '12 at 12:57
    -1 from me; I trust Linus, Bill, and Steve much more than I trust your logic. Not to mention your brash responses to those who provided explanations and answers to your "question" – Brent Pabst Sep 13 '12 at 13:50
    This isn't a forum. – Chopper3 Sep 13 '12 at 14:11
  • The first response was by Chopper3 in a provocative, incredulous tone, with capitals and dismissing my idea as 'nuts'. At no point have I been rude or aggressive and yet I'm the bad guy. Close the question please someone. – Luke Puplett Sep 13 '12 at 14:30
  • @LukePuplett Good call on providing those links (eventually); had you done so in the first place, your question would probably have a few upvotes instead of 6 downvotes. That said, those links are applicable to specific situations for the specific technologies referenced, and not applicable to the general question you posed, which is about general issues (that don't exist, FWIW) created by Windows VMs swapping when their OS disks are served from a SAN. – HopelessN00b Sep 16 '12 at 19:05
  • @HopelessN00b - So should we ask for a Best Practices Wiki for Windows Servers and Hyper V (bit broad) or post a specific Server scenario (too narrow?) for an answer? – Alex S Mar 06 '17 at 11:45

3 Answers

9

OK, I think you're misunderstanding or mis-estimating a few basic concepts here, so in the order I see them in your question, here we go:

(1) Your assertion that Windows makes heavy use of the pagefile even with ample RAM is incorrect. It doesn't.

  • The pagefile may stay the same size, but whether it's actually utilized to store anything or not is a whole other matter. Allocation (how much stuff is in it) is also very different from utilization (how heavily/frequently it's used), and just because Windows loads a large amount of memory contents into the page file does not mean they will be heavily used, or even used at all. Virtual memory, or the Windows page file, is conceptually just a place to put data ("stuff") that's not needed urgently enough or frequently enough to be stored in RAM, but is needed more urgently or more frequently than an arbitrary data chunk elsewhere on the hard disk (some random file).

(2) Your idea to solve this non-existent problem with a [very large] RAM disk won't work.

  • You could cram a petabyte RAM disk in there, and Windows (and operating systems in general) will still utilize any virtual memory space you allow them for any data it determines ought to be stored in virtual memory, because virtual memory, at a conceptual level, is "just" somewhere faster to access than the normal disk, but slower to access than RAM. Since RAM has historically been much more expensive and scarce than disk space, this was (and still is) a great compromise to keep the costs of computing down - if the data isn't frequently accessed, or expected to be accessed soon, it doesn't need to be on high-cost, extremely fast access chips (RAM), so you can put it somewhere cheaper [and thus slower] (a "hard disk") to allow prioritized data to occupy that finite resource.

(3) You are worrying about something totally insignificant to SAN performance.

  • A SAN is a system of networking and storage that is specifically designed for the express purpose of transferring large amounts of data from the computer systems that access it to the disks in your SAN's storage controller. (There are other functions as well, of course, but that's the main one.) There are people whose sole job, and primary function in life, is to design these systems to do that. They have put a lot more thought, study, scientific methodology, and higher math into doing so than any of the rest of us ever will. They do know what they're doing, and they're far better equipped to fuss over those nitty-gritty details than you or I.

  • Perhaps even more to the point, swapping (transferring to/from virtual memory) memory contents is going to be a minute portion of your SAN bandwidth, unless you have a whole lot of machines doing obscene volumes of swapping, which would be noticeable, in that this level of swapping would render the machines unusable. So unless that's the case in your environment, the relative volume of data transfer to and from the SAN as a result of swapping is insignificant. Your SAN will see the greatest portion of its traffic (by a vast, vast margin) from activities more normally associated with accessing the hard disk. These would be things like moving files, saving or changing data, reading and updating databases, and so on. If there's actually a need to optimize traffic to and from your SAN (and why do you think that - are your system response times actually degraded due to a SAN bandwidth or throughput bottleneck?), there is a very long list of far more important considerations than how much the Windows servers are swapping virtual memory.

(4) Regarding your statement about how Microsoft should get rid of paging...

EDIT:

In response to your comment (@LukePuplett), I'll add the below to my answer as well.

OK, since it seems, after all that, you're still not convinced, and still think Windows paging is degrading SAN performance, let me suggest that you "do the math" on what impact Windows swapping is having on your SAN I/O. You'd do so by:

  1. Monitoring (via perfmon or what have you) on your Windows servers (or at least a decent sample thereof) to determine/estimate how many hard page faults you have. This would be the number of swap events that go to your SAN, and it's best captured as a per-second rate. Call this [A].
  2. The same monitoring with different counters to determine the average size of your page events. This will probably be in kilobytes; convert it to bytes (multiply by 1,000). Call this [B].
  3. Computing the theoretical maximum throughput at your SAN's bottleneck. This is probably (and is certainly the easiest to compute as) the network speed rating of your SAN switches/fiber - for example, an 8 Gb/s fibre-channel SAN would (or could) have a maximum theoretical throughput of 8 Gb/s. Since this is in bits/s and your other values are in bytes/s, divide it by 8. Call this [C].

Take ([A]*[B]*2)/[C] to get the fraction of maximum throughput which your paging data is consuming (multiply by 100 for a percentage). (Multiply by 2, as the hard faults only measure data transfer in one direction, and there will usually be, at least in theory, a corresponding data transfer in the other direction too. An instance of moving data from RAM into virtual memory should usually correspond to moving disk-stored data into RAM as well.)
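
To make that concrete, here's a minimal sketch of the arithmetic in Python. The counter values are made-up assumptions purely for illustration (100 hard faults per second, one 4 KB page per fault, an 8 Gb/s SAN link); substitute your own perfmon measurements for [A] and [B].

```python
# Hedged sketch of the ([A]*[B]*2)/[C] estimate with assumed, illustrative numbers.

hard_faults_per_sec = 100        # [A] hard page faults per second (assumed; measure with perfmon)
avg_page_io_bytes = 4 * 1024     # [B] bytes transferred per fault (assumed 4 KB, the x86 page size)
san_link_gbps = 8                # nominal SAN link speed, in gigabits per second

san_bytes_per_sec = san_link_gbps * 1_000_000_000 / 8   # [C] convert Gb/s to bytes/s

# Multiply by 2 to account for the corresponding transfer in the opposite direction.
paging_bytes_per_sec = hard_faults_per_sec * avg_page_io_bytes * 2

share = paging_bytes_per_sec / san_bytes_per_sec
print(f"Paging traffic is {share:.4%} of the SAN's theoretical maximum throughput")
# With these assumed numbers the result is about 0.08% - the same order of
# magnitude as the ~0.1% figure quoted below.
```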

Unless your environment is highly abnormal, this value will be on the same order of magnitude as one tenth of one percent (0.1%) of your SAN's maximum throughput, based on my metrics from dozens of different environments over the span of about a decade. As mentioned above, in the original answer text, there is a very long list of other factors (literally a couple dozen discrete configs or factors I could rattle off the top of my head) that you'd be able to get more than a 0.1% "performance" improvement out of your SAN by tuning, if your SAN would actually benefit from performance tuning at all, which is something you've provided absolutely no evidence to suggest.

HopelessN00b
  • Firstly, thanks for taking the time to answer, and I take your points on train-wreck thinking. However, my question has merit. While a RAM disk set up by the host Windows VM won't work, it may do if set up and made available by the host OS - but let's leave that. My main concern is that the hundreds of hard faults/min are going over the network to a contended store, competing mostly with my own data IO as you pointed out. While there may be brilliant minds working on SANs, there are also brilliant salespeople. Amidst all this, pfo has managed to provide a humble and proper response. – Luke Puplett Sep 13 '12 at 12:15
  • "While a RAM disk setup by the host Windows VM won't work, it may do if setup and made available by the host OS" - not sure that makes sense sorry – Chopper3 Sep 13 '12 at 12:25
    +1 Windows also requires a pagefile to be present on the boot volume in order to be able to create memory dumps when a system crash occurs. – Ansgar Wiechers Sep 13 '12 at 12:29
    @LukePuplett You are, of course, free to dismiss my answer and comment. To your continued concern regarding hard page faults, I advise you to do the math. How many hard page faults do you see a minute going to/from your SAN? Multiply that number by the size of the average hard-faulting memory page (in bits), and divide by 60 (to convert from per minute to per second). Compare with the data throughput of your SAN at the bottleneck (probably network, which will be in gigabits/s), and you're almost certainly looking at less than even one tenth of one percent of your SAN's maximum throughput. – HopelessN00b Sep 13 '12 at 12:59
  • @HopelessN00b I'm sorry about all this. I'm not worried about the SAN. I'm concerned that the constant 'chatter' with the paging file I see when monitoring my VMs is slower than it was ever intended to be since it goes over the wire, not just a local disk array. My VM disk queue times are up. Let's leave it. Sorry to have wasted your time. – Luke Puplett Sep 13 '12 at 15:09
  • @LukePuplett No biggie. I'd just suggest that a) swapping waits until it either has to happen urgently or it can happen without performance impact, so you won't see a performance impact, and b) I'm the most senior sysadmin where I work, and we have a somewhat large VMware environment with multiple blade centers stuffed full of blades, hosting multiple Windows VMs each, all having every bit of their hard disks on a SAN. We don't have any SAN issues from swapping traffic. I'm not really trying to argue with you, just to reassure you about it, because it's definitely not creating a problem. – HopelessN00b Sep 14 '12 at 05:14
5

While Windows DOES indeed use the pagefile for more than JUST memory overcommit scenarios, I think this is nuts; just give the VM the memory it needs and deal with the storage performance issues separately - what you're suggesting would be a mistake.

Oh and pagefiles aren't going anywhere soon.

Chopper3
  • Windows uses the pagefile all the time, regardless of RAM pressure. So yes, I believe putting this in the fastest place possible is desirable. – Luke Puplett Sep 13 '12 at 10:55
    I said that, but lowering your usable memory to make way for a RAM disk to store data that can't fit in memory isn't right; just add more memory and fix the disk performance issue rather than make things worse. – Chopper3 Sep 13 '12 at 10:56
  • The pagefile provides a soft landing for runaway memory usage, but it's actually used much more of the time in practice. It's usually roughly 1.5x physical RAM in size. So, I think, the OS should just enforce that it can only use 40% of physical RAM, and it'll start screaming when you're at this point. We need no pagefile. People will just (correctly) buy the right amount of RAM for the job. – Luke Puplett Sep 13 '12 at 11:13
    ok, you know best. – Chopper3 Sep 13 '12 at 11:14
3

You can place your VM's swap area (the one the hypervisor uses to swap out vRAM to) on a faster LUN/datastore that doesn't use thin provisioning and is backed by the fastest spindles/NAND your array has, and leave the pagefile under the OS's supervision on its OS installation partition. Monitor the swap area usage and pagefile usage and buy more vRAM and DIMMs as needed. Anything else is overdoing it.

Also, getting rid of paging to disk is not going to happen anytime soon, as the concept of virtual memory is essential to any halfway decent operating system and hardware platform.

pfo