
This is a Windows Server 2003 machine.

We are running a performance test, and this is what we see:

  1. In the first 5 hours, the page fault rate is very low, around 10 to 20 page faults/sec.

  2. In the last hour, the page fault rate jumps to 500 page faults/sec.

  3. In the last hour, the Java server stops logging anything for 6-7 seconds and then resumes. This happened about 200 times in the last hour.

  4. We suspect this is caused by JVM garbage collection.

What I want to know is: while the JVM is doing GC, is it expected to see a much higher page fault rate than when no GC is running?
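One way to check the GC suspicion is to sample the JVM's own collector counters and line them up with the logging gaps and the page fault counter. The following is a minimal sketch, assuming it can be run from a background thread inside the server JVM; the class name `GcProbe` and the one-second interval are illustrative choices, not part of the original setup. The standard `-verbose:gc` JVM option gives similar information without code changes.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: samples cumulative GC counters exposed by the JVM.
public class GcProbe {

    // Sum of collections over all collectors; -1 means "undefined" and is skipped.
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Sum of approximate accumulated collection time in milliseconds.
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long prevCount = totalCollections();
        long prevMillis = totalCollectionMillis();
        // Three samples, one second apart (assumed values for the sketch).
        for (int i = 0; i < 3; i++) {
            Thread.sleep(1000);
            long count = totalCollections();
            long millis = totalCollectionMillis();
            System.out.printf("collections=+%d gcTime=+%dms%n",
                    count - prevCount, millis - prevMillis);
            prevCount = count;
            prevMillis = millis;
        }
    }
}
```

If the intervals with several seconds of added GC time coincide with the 6-7 second logging gaps and with the page fault spikes, that supports the GC explanation.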

performanceuser

1 Answer


Yes.

When you do garbage collection, you tend to access a large number of pages that haven't been accessed recently. Many of these pages were probably considered candidates for eviction should the system encounter memory pressure.

The first such access to each such page (or group of pages managed as a unit) requires the OS to remove them from the set of eviction candidates and instead consider them recently-accessed. This requires a soft page fault to give the OS a chance to change the accounting.

Lots of pages accessed that haven't been accessed recently means lots of soft page faults.
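The accounting described above can be illustrated with a toy model. This is only a sketch of the mechanism as the answer describes it, not the Windows memory manager: the class `SoftFaultModel`, the page counts, and the "arm every page" ager pass are all assumptions made for illustration.

```java
import java.util.BitSet;

// Toy model (hypothetical): pages "armed" by an ager cost one soft fault
// on their first access afterwards, then are treated as recently accessed.
public class SoftFaultModel {
    private final int pageCount;
    private final BitSet armed;   // pages rigged to fault on next access
    private long softFaults;

    public SoftFaultModel(int pageCount) {
        this.pageCount = pageCount;
        this.armed = new BitSet(pageCount);
    }

    // Ager pass: arm every page so that its next access is noticed.
    public void agePass() {
        armed.set(0, pageCount);
    }

    // Access one page; only the first access after arming counts as a soft fault.
    public void touch(int page) {
        if (armed.get(page)) {
            softFaults++;
            armed.clear(page);
        }
    }

    public long softFaults() {
        return softFaults;
    }

    public static void main(String[] args) {
        SoftFaultModel model = new SoftFaultModel(10_000);
        model.agePass();

        // Steady state: the application keeps touching a small hot set.
        for (int round = 0; round < 100; round++) {
            for (int page = 0; page < 20; page++) {
                model.touch(page);
            }
        }
        long steady = model.softFaults();

        // GC-like sweep: every page is touched once.
        for (int page = 0; page < 10_000; page++) {
            model.touch(page);
        }
        long sweep = model.softFaults() - steady;

        System.out.println("steady=" + steady + " sweep=" + sweep);
        // prints "steady=20 sweep=9980"
    }
}
```

In this model the hot set costs 20 faults no matter how often it is re-read, while a single sweep over the whole heap costs one fault per cold page, which is the jump in the fault rate the answer describes.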

David Schwartz
  • What if we have plenty of RAM, like 64 GB, and the whole system is using only 20 GB? Are we supposed to see a high page fault rate? I don't think so, since everything can stay in RAM. – performanceuser Apr 01 '13 at 22:45
  • Yes, you are. Otherwise, how would the OS know to keep the pages in RAM? (The page faults are what tells the OS that the page is being accessed. Without a page fault, the OS has no idea what memory is being accessed.) – David Schwartz Apr 01 '13 at 22:53
  • But if all the memory is in RAM, when the application accesses a virtual address, isn't it supposed to be mapped to a physical address without a page fault? – performanceuser Apr 02 '13 at 14:36
  • @performanceuser: You mean without a *hard* page fault. Hard page faults are not the only kinds of page faults. How do you think the OS decides *what* data to keep in RAM? Without page faults, it would have no idea that the data was being accessed. (The page faults invoke the OS's page handler which marks that the pages have been accessed so it knows to keep them in RAM. Otherwise, it would have no idea which pages were being accessed, and thus should be kept in RAM, and which weren't, and thus shouldn't.) – David Schwartz Apr 02 '13 at 18:20
  • So you agree that there should not be any hard page faults, right? In my case, the OS should have enough RAM to keep everything in physical memory, so there are no hard page faults. Then I don't understand why lots of other kinds of page faults happen during GC. During GC, it just tries to access all the objects that have been created by the application. Why is there a soft page fault then? – performanceuser Apr 04 '13 at 21:10
  • There are soft page faults because without them, the operating system would have no way of knowing the memory was in use and thus no way to know it should keep the pages in memory. Operating systems use soft page faults to keep track of what memory is accessed. Since the pages haven't been accessed in a while, the operating system sets them to trigger a page fault when they're accessed so the OS will know they have been accessed. The first access to each such page (or collection of pages managed as a unit) triggers a page fault. A page fault means the OS did something when memory was accessed. – David Schwartz Apr 04 '13 at 21:20
  • From your explanation of soft page faults, it seems that every memory access will trigger a soft page fault, which doesn't make sense to me. From my understanding, a soft page fault happens when the page is in physical memory but not in the working set of the current process. Is this correct? If yes, I want to know when a page will be removed from a working set. If I have a Map or Set to cache items, will they be removed from the working set if we don't access them for a long time? – performanceuser Apr 05 '13 at 18:56
  • What I'm talking about has little to do with whether a page is in the working set or not. It's about whether the process has recently accessed the page. In order to manage things like page eviction and working sets, the OS has to know which pages a process has recently accessed. To determine this, it periodically arranges it so that those accesses trigger a soft page fault. The fault handler marks the page accessed and changes it so that accesses won't trigger a soft page fault until later when it needs to know again. Removing pages from the working set is much later in the process. – David Schwartz Apr 05 '13 at 19:07
  • I don't quite agree with your definition of soft page fault. From wiki: If the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory, then it is called a minor or soft page fault. It is quite different from what you described here. – performanceuser Apr 05 '13 at 19:22
  • That is also called a soft page fault. A "page fault" is a fault triggered by accessing a page. A "hard page fault" (or "major page fault") is a page fault that requires slow I/O (disk, network, etcetera). A "soft page fault" (or "minor page fault") is a page fault that does not. – David Schwartz Apr 05 '13 at 19:30
  • My point is that if the page is in memory and in my working set, I don't think a page fault will be triggered. – performanceuser Apr 05 '13 at 19:32
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/8209/discussion-between-david-schwartz-and-performanceuser) – David Schwartz Apr 05 '13 at 19:41
  • I read the ppt, and I understand the pages will be trimmed. But the OS only trims pages in the working set, agreed? – performanceuser Apr 05 '13 at 20:12
  • The definition of soft page fault is: Soft page faults may also occur when the page is in a transitional state because it has been removed from the working sets of the processes that were using it, or it is resident as the result of a prefetch operation. Based on this, I think trimming won't trigger a soft page fault, because trimming only happens to pages in the working set. When a page is removed from the working set, it becomes available to other processes. – performanceuser Apr 05 '13 at 20:12
  • You're missing that pages *in the working set* are aged. To age pages, you must know that they weren't accessed. To know that they weren't accessed, you must know that they were accessed. To know that they were accessed, there must be a fault. The only way to do this is to periodically arrange for access to pages in the working set to trigger a page fault. – David Schwartz Apr 05 '13 at 20:13
  • I still don't believe it. Can you post a link that explains when aging a page requires a soft fault? Your theory contradicts the definition of a soft page fault. – performanceuser Apr 05 '13 at 20:15
  • Here is another definition of soft page fault: Soft page faults may also occur when the page is in a transitional state because it has been removed from the working sets of the processes that were using it, or it is resident as the result of a prefetch operation. – performanceuser Apr 05 '13 at 20:15
  • The definition of a [soft page fault](http://en.wikipedia.org/wiki/Page_fault#Minor) is a page fault that doesn't require slow I/O. None of the things you cite are "definition" of a soft page fault, they're just describing some of the circumstances under which a soft page fault can occur. Aging a page requires a soft page fault if the page hasn't been accessed recently. The page ager sets pages to trigger a page fault if it needs to know when they are accessed. (There is no conceivable way Windows 7 could age the working set without soft page faults. Think about it.) – David Schwartz Apr 05 '13 at 20:44
  • (If this is the explanation, they will be counted as transition faults.) – David Schwartz Apr 05 '13 at 20:49
  • OK. Assuming aging requires soft page faults, let's go back to your original answer. You said: "Many of these pages were probably considered candidates for eviction should the system encounter memory pressure." Can you tell me if these pages are still in the working set? – performanceuser Apr 05 '13 at 21:17
  • In other words, what kind of value shows that a page is considered a candidate for eviction? My understanding is that a page is a candidate if it is not in any process's working set. I am not sure if this is correct. Are there any other cases in which a page is considered a candidate for eviction? – performanceuser Apr 05 '13 at 21:18
  • @performanceuser: It's a matter of your definition of "working set". You can answer either "yes" or "no" if you want. The pages are in memory. They are somewhat recently accessed but not very recently accessed. They are rigged for a transition page fault if accessed by any process. You are welcome to consider them part of the working set (of processes that have mapped them) or not as you wish. Windows can count them towards some processes that have them mapped and not others depending on whether any working sets have been trimmed. – David Schwartz Apr 05 '13 at 21:19
  • Take it to [chat] if you need to discuss this further. Comments are not meant to be a discussion forum. – Chris S Apr 05 '13 at 21:24