I've been asked to look at an internal application, written in C++ and running on Linux, that's having some difficulties. On some runs it incurs a large number of major page faults (~200k), which increases the wall-clock run time by 10x or more; on other runs it incurs none.
I've tried isolating different pieces of the code, but I'm struggling to reproduce the page faults when testing them in isolation.
Does anyone have suggestions for getting more information about major page faults out of the application or out of Linux? All I really have at the moment is a total count for the whole run.
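
One thing that might narrow it down is sampling the fault counter around individual sections instead of reading a single total at the end. The following is only a minimal sketch of that idea, assuming `getrusage(2)` and its `ru_majflt` field are an acceptable source for the count; `major_faults` and `faults_during` are hypothetical helper names, not anything from the application.

```cpp
#include <sys/resource.h>  // getrusage, struct rusage

#include <functional>

// Returns the number of major page faults the whole process has incurred
// so far, or -1 if getrusage fails.
long major_faults() {
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_majflt;
}

// Hypothetical helper: runs `section` and returns how many major page
// faults the process incurred while it ran. RUSAGE_SELF covers every
// thread in the process, so faults from other threads are included.
long faults_during(const std::function<void()>& section) {
    const long before = major_faults();
    section();
    const long after = major_faults();
    if (before < 0 || after < 0) {
        return -1;  // counter could not be read
    }
    return after - before;
}
```

Wrapping each suspect region, for example `faults_during([&] { load_input(); })` with `load_input` standing in for a real function, and logging the result per run would at least show which region the faults land in on the slow runs.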