A production web service that runs WCF on a Windows Server 2003 machine was non-responsive. I captured a memory dump file for analysis in DebugDiag 1.2.
DebugDiag revealed that a thread encountered an OutOfMemoryException. Normally, we try to trap all exceptions, print them to the event log, and then return them as a WCF fault.
This OutOfMemoryException was somehow not caught and instead resulted in a Visual C++ runtime error. I accept that the .NET runtime was unable to handle this error. When you're out of memory, you're out of memory.
The analysis revealed that one of the threads was attempting to display the Visual C++ runtime error:
"Microsoft Visual C++ Runtime Library Runtime Error! Program: C:\windows\system32\inetsrv".
The stack trace looked like this:
ntdll!KiFastSystemCallRet
ntdll!ZwRaiseHardError+c
user32!ServiceMessageBox+145
user32!MessageBoxWorker+13e
user32!MessageBoxTimeoutW+7a
user32!MessageBoxTimeoutA+9c
user32!MessageBoxExA+1b
user32!MessageBoxA+45
msvcr71!__crtMessageBoxA+f4 f:\vs70builds\3052\vc\crtbld\crt\src\crtmbox.c @ 118 + 10
msvcr71!_NMSG_WRITE+12e f:\vs70builds\3052\vc\crtbld\crt\src\crt0msg.c @ 240 + 10
msvcr71!abort+7 f:\vs70builds\3052\vc\crtbld\crt\src\abort.c @ 48
kernel32!UnhandledExceptionFilter+12a
kernel32!BaseThreadStart+4a
kernel32!_except_handler3+61
ntdll!ExecuteHandler2+26
ntdll!ExecuteHandler+24
ntdll!RtlRaiseException+3d
kernel32!RaiseException+53
msvcr80!_CxxThrowException+46
mscorwks!ThrowOutOfMemory+24
The documentation of abort says:
In a single or multithreaded Windows-based application, abort calls the Windows MessageBox function to create a message box to display the message with an OK button. When the user clicks OK, the program aborts immediately. The message can be suppressed by calling _set_abort_behavior with the appropriate arguments.
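For reference, my understanding of the call that documentation refers to is roughly the following. This is only a minimal sketch: the helper name `suppress_abort_dialog` is hypothetical, and it assumes native code built against a Visual C++ 2005 or later CRT that gets a chance to run inside the worker process before anything calls abort. Each CRT DLL keeps its own settings, so I am not sure a call like this would reach the msvcr71 copy of abort that appears on the stack above.

```cpp
#include <cstdlib>  // on MSVC this also declares _set_abort_behavior

// Hypothetical helper (the name is mine, not from any library).
// Asks the CRT that this module is linked against not to show the
// abort message box. Returns true if the request was actually made.
bool suppress_abort_dialog()
{
#if defined(_MSC_VER) && _MSC_VER >= 1400
    // Clear both flags: no "Runtime Error!" message and no fault reporting.
    _set_abort_behavior(0, _WRITE_ABORT_MSG | _CALL_REPORTFAULT);
    return true;
#else
    // Not a Visual C++ 2005+ CRT: nothing to configure in this sketch.
    return false;
#endif
}
```

It would have to be called once at startup, for example from the initialization code of a native DLL loaded into the process, and it only affects the CRT instance that the calling module uses.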
This message box hung the server. One thread had triggered a GC, but this thread had disabled preemptive GC. Most of the remaining threads were blocked waiting for the GC to complete.
How do I disable the Visual C++ Runtime Error dialog for a web server?
-- EDIT --
The OutOfMemoryException was thrown in one thread that was processing a large DataSet. Once this was thrown, there was a cross-context exception thrown. This resulted in the following stack trace in WinDbg:
1c54eee8 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53 (FPO: [Non-Fpo])
1c54ef20 7a14fd18 1c54ef30 7a37d92c 7a3c4aa8 **msvcr80**!_CxxThrowException+0x46 (FPO: [Non-Fpo])
1c54ef34 7a1082db f74a69a8 79f38888 1c54f108 mscorwks!ThrowOutOfMemory+0x24 (FPO: [Non-Fpo])
1c54f060 7a10a245 00000000 1c54f098 1c54f108 mscorwks!Thread::RaiseCrossContextException+0x408 (FPO: [Non-Fpo])
1c54f114 79fd882b 00000002 79fd87f6 1c54f20c mscorwks!Thread::DoADCallBack+0x2a2 (FPO: [Non-Fpo])
1c54f130 79e9846b 1c54f20c 1c54f1b8 79f7762b mscorwks!Thread::DoADCallBack+0x310 (FPO: [Non-Fpo])
1c54f1c4 79e98391 1c54f20c f74a6bc8 23e78e00 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3 (FPO: [Non-Fpo])
1c54f200 79e9851d 1c54f20c 00000002 00000000 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a (FPO: [Non-Fpo])
1c54f228 79fd8f6c 00000002 7a0b68a2 1c54f264 mscorwks!Thread::ShouldChangeAbortToUnload+0x33e (FPO: [Non-Fpo])
1c54f240 7a0b6b5b 00000002 7a0b68a2 1c54f264 mscorwks!ManagedThreadBase::ThreadPool+0x13 (FPO: [Non-Fpo])
1c54f294 7a0b6b8d 00000000 00000000 04a47fe0 mscorwks!BindIoCompletionCallbackStubEx+0x95 (FPO: [Non-Fpo])
1c54f2ac 79f3e605 00000000 00000000 04a47fe0 mscorwks!BindIoCompletionCallbackStub+0x13 (FPO: [Non-Fpo])
1c54f314 79f920a5 00000000 00000000 7ffdc000 mscorwks!ThreadpoolMgr::CompletionPortThreadStart+0x430 (FPO: [Non-Fpo])
1c54ffb8 77e64829 23e29018 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49 (FPO: [Non-Fpo])
1c54ffec 00000000 79f9205f 23e29018 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])
And that exception resulted in a call to the runtime to display this message:
1c54e340 7c82775b 773d7a4b 50000018 00000004 ntdll!KiFastSystemCallRet
1c54e344 773d7a4b 50000018 00000004 00000003 ntdll!NtRaiseHardError+0xc
1c54e3a0 773b8377 1d05ff70 1d052e48 00012010 user32!ServiceMessageBox+0x145
1c54e4fc 7739eec9 1c54e508 00000028 00000000 user32!MessageBoxWorker+0x13e
1c54e554 773d7d0d 00000000 1d05ff70 1d052e48 user32!MessageBoxTimeoutW+0x7a
1c54e588 773c42c8 00000000 1c54e62c 7c37f480 user32!MessageBoxTimeoutA+0x9c
1c54e5a8 773c42a4 00000000 1c54e62c 7c37f480 user32!MessageBoxExA+0x1b
1c54e5c4 7c34c224 00000000 1c54e62c 7c37f480 user32!MessageBoxA+0x45
1c54e5f8 7c348e6c 1c54e62c 7c37f480 00212010 msvcr71!__crtMessageBoxA+0xf4 [f:\vs70builds\3052\vc\crtbld\crt\src\crtmbox.c @ 118]
1c54e81c 7c34cf83 0000000a 00000000 1c54ead4 msvcr71!_NMSG_WRITE+0x12e [f:\vs70builds\3052\vc\crtbld\crt\src\crt0msg.c @ 240]
1c54e854 77e761b7 1c54ead4 00000000 00000000 msvcr71!abort+0x7 [f:\vs70builds\3052\vc\crtbld\crt\src\abort.c @ 48]
1c54eaac 77e792a3 1c54ead4 77e61ac1 1c54eadc kernel32!UnhandledExceptionFilter+0x12a
1c54eab4 77e61ac1 1c54eadc 00000000 1c54eadc kernel32!BaseThreadStart+0x4a
1c54eadc 7c828752 1c54ee98 1c54ffdc 1c54ebb8 kernel32!_except_handler3+0x61
1c54eb00 7c828723 1c54ee98 1c54ffdc 1c54ebb8 ntdll!ExecuteHandler2+0x26
1c54eba8 7c82863c 1c549000 1c54ebb8 00010007 ntdll!ExecuteHandler+0x24
1c54ee88 77e4bee7 1c54ee98 00000002 e06d7363 ntdll!RtlRaiseException+0x3d
1c54eee8 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53
I'm not sure why there are two versions of the Visual C++ runtime library (msvcr71 and msvcr80) on the stack. That is unusual. I see no evidence of other third-party DLLs on the stack.