A production web service that runs WCF on a Windows Server 2003 machine was non-responsive. I captured a memory dump file for analysis in DebugDiag 1.2.

DebugDiag revealed that a thread encountered an OutOfMemoryException. Normally, we trap all exceptions, write them to the event log, and then return them as WCF faults.

This OutOfMemoryException was somehow not caught and resulted in a C++ error. I accept that the .NET runtime was unable to handle this error. When you're out of memory, you're out of memory.

The analysis revealed that one of the threads was attempting to display the Visual C++ runtime error:

Microsoft Visual C++ Runtime Library Runtime Error! Program: C:\windows\system32\inetsrv".

The stack trace looked like this:

ntdll!KiFastSystemCallRet    
ntdll!ZwRaiseHardError+c    
user32!ServiceMessageBox+145    
user32!MessageBoxWorker+13e    
user32!MessageBoxTimeoutW+7a    
user32!MessageBoxTimeoutA+9c    
user32!MessageBoxExA+1b    
user32!MessageBoxA+45    
msvcr71!__crtMessageBoxA+f4   f:\vs70builds\3052\vc\crtbld\crt\src\crtmbox.c @ 118 + 10 
msvcr71!_NMSG_WRITE+12e  f:\vs70builds\3052\vc\crtbld\crt\src\crt0msg.c @ 240 + 10 
msvcr71!abort+7   f:\vs70builds\3052\vc\crtbld\crt\src\abort.c @ 48 
kernel32!UnhandledExceptionFilter+12a    
kernel32!BaseThreadStart+4a    
kernel32!_except_handler3+61    
ntdll!ExecuteHandler2+26    
ntdll!ExecuteHandler+24    
ntdll!RtlRaiseException+3d    
kernel32!RaiseException+53    
msvcr80!_CxxThrowException+46    
mscorwks!ThrowOutOfMemory+24 

The documentation of abort says:

In a single or multithreaded Windows-based application, abort calls the Windows MessageBox function to create a message box to display the message with an OK button. When the user clicks OK, the program aborts immediately. The message can be suppressed by calling _set_abort_behavior with the appropriate arguments.

This message box hung the server. One thread had triggered a GC, but the thread showing the message box had preemptive GC disabled. Most of the remaining threads were blocked waiting for the GC to complete.

How do I disable the Visual C++ Runtime Error dialog for a web server?

-- EDIT --

The OutOfMemoryException was thrown in one thread that was processing a large DataSet. After it was thrown, the runtime raised a cross-context exception. This resulted in the following stack trace in WinDbg:

1c54eee8 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53 (FPO: [Non-Fpo])
1c54ef20 7a14fd18 1c54ef30 7a37d92c 7a3c4aa8 **msvcr80**!_CxxThrowException+0x46 (FPO: [Non-Fpo])
1c54ef34 7a1082db f74a69a8 79f38888 1c54f108 mscorwks!ThrowOutOfMemory+0x24 (FPO: [Non-Fpo])
1c54f060 7a10a245 00000000 1c54f098 1c54f108 mscorwks!Thread::RaiseCrossContextException+0x408 (FPO: [Non-Fpo])
1c54f114 79fd882b 00000002 79fd87f6 1c54f20c mscorwks!Thread::DoADCallBack+0x2a2 (FPO: [Non-Fpo])
1c54f130 79e9846b 1c54f20c 1c54f1b8 79f7762b mscorwks!Thread::DoADCallBack+0x310 (FPO: [Non-Fpo])
1c54f1c4 79e98391 1c54f20c f74a6bc8 23e78e00 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3 (FPO: [Non-Fpo])
1c54f200 79e9851d 1c54f20c 00000002 00000000 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a (FPO: [Non-Fpo])
1c54f228 79fd8f6c 00000002 7a0b68a2 1c54f264 mscorwks!Thread::ShouldChangeAbortToUnload+0x33e (FPO: [Non-Fpo])
1c54f240 7a0b6b5b 00000002 7a0b68a2 1c54f264 mscorwks!ManagedThreadBase::ThreadPool+0x13 (FPO: [Non-Fpo])
1c54f294 7a0b6b8d 00000000 00000000 04a47fe0 mscorwks!BindIoCompletionCallbackStubEx+0x95 (FPO: [Non-Fpo])
1c54f2ac 79f3e605 00000000 00000000 04a47fe0 mscorwks!BindIoCompletionCallbackStub+0x13 (FPO: [Non-Fpo])
1c54f314 79f920a5 00000000 00000000 7ffdc000 mscorwks!ThreadpoolMgr::CompletionPortThreadStart+0x430 (FPO: [Non-Fpo])
1c54ffb8 77e64829 23e29018 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49 (FPO: [Non-Fpo])
1c54ffec 00000000 79f9205f 23e29018 00000000 kernel32!BaseThreadStart+0x34 (FPO: [Non-Fpo])

And that exception resulted in a call to the runtime to display this message:

1c54e340 7c82775b 773d7a4b 50000018 00000004 ntdll!KiFastSystemCallRet
1c54e344 773d7a4b 50000018 00000004 00000003 ntdll!NtRaiseHardError+0xc
1c54e3a0 773b8377 1d05ff70 1d052e48 00012010 user32!ServiceMessageBox+0x145
1c54e4fc 7739eec9 1c54e508 00000028 00000000 user32!MessageBoxWorker+0x13e
1c54e554 773d7d0d 00000000 1d05ff70 1d052e48 user32!MessageBoxTimeoutW+0x7a
1c54e588 773c42c8 00000000 1c54e62c 7c37f480 user32!MessageBoxTimeoutA+0x9c
1c54e5a8 773c42a4 00000000 1c54e62c 7c37f480 user32!MessageBoxExA+0x1b
1c54e5c4 7c34c224 00000000 1c54e62c 7c37f480 user32!MessageBoxA+0x45
1c54e5f8 7c348e6c 1c54e62c 7c37f480 00212010 msvcr71!__crtMessageBoxA+0xf4 [f:\vs70builds\3052\vc\crtbld\crt\src\crtmbox.c @ 118]
1c54e81c 7c34cf83 0000000a 00000000 1c54ead4 msvcr71!_NMSG_WRITE+0x12e [f:\vs70builds\3052\vc\crtbld\crt\src\crt0msg.c @ 240]
1c54e854 77e761b7 1c54ead4 00000000 00000000 msvcr71!abort+0x7 [f:\vs70builds\3052\vc\crtbld\crt\src\abort.c @ 48]
1c54eaac 77e792a3 1c54ead4 77e61ac1 1c54eadc kernel32!UnhandledExceptionFilter+0x12a
1c54eab4 77e61ac1 1c54eadc 00000000 1c54eadc kernel32!BaseThreadStart+0x4a
1c54eadc 7c828752 1c54ee98 1c54ffdc 1c54ebb8 kernel32!_except_handler3+0x61
1c54eb00 7c828723 1c54ee98 1c54ffdc 1c54ebb8 ntdll!ExecuteHandler2+0x26
1c54eba8 7c82863c 1c549000 1c54ebb8 00010007 ntdll!ExecuteHandler+0x24
1c54ee88 77e4bee7 1c54ee98 00000002 e06d7363 ntdll!RtlRaiseException+0x3d
1c54eee8 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53

I'm not sure why two versions of the Visual C++ runtime library (msvcr71 and msvcr80) are on the stack. That is unusual. I see no evidence of other third-party DLLs on the stack.

Paul Williams
  • Is that the top of the stack? Who is calling `mscorwks!ThrowOutOfMemory`? It sounds like maybe you're calling some code that is not meant to be used in a service application. – John Saunders Nov 07 '12 at 23:06
  • No, that is the call stack after the OutOfMemoryException generated a C++ runtime exception. The source of the OutOfMemoryException was several very large DataSets loaded into memory. That is a separate issue. I want to prevent web service hangs from errors like this. – Paul Williams Nov 07 '12 at 23:16
  • Do you mean that there was a .NET System.OutOfMemoryException? Why would that generate a C++ runtime exception? Is this a C++ application? – John Saunders Nov 08 '12 at 02:50
  • This is a .NET WCF application that was trying to load a large DataSet into memory and return it to the client. (We do use WCF streaming, but that was not used here for some reason.) The DataSet caused a .NET OutOfMemoryException to be thrown. Somehow this OutOfMemoryException was rethrown as a C++ exception. The process caught it in UnhandledExceptionFilter, which called abort(), and abort tried to pop up the C++ runtime error message box. – Paul Williams Nov 08 '12 at 16:54
  • If I were you, I'd find out where that C++ exception came from. Why is the C++ runtime even present? – John Saunders Nov 08 '12 at 18:23

1 Answer

You have to call the C runtime function `_set_abort_behavior` from your assembly. You can use P/Invoke to do this.

Here is an interesting article about a similar issue. http://blogs.msdn.com/b/pfedev/archive/2010/08/25/whodunit-who-threw-the-message-box-and-why.aspx

The article suggests this is a known issue that has not been addressed, hence the suggestion to turn the dialog off with _set_abort_behavior.
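
As a rough illustration of the CRT calls involved, here is a minimal native sketch. The helper name `suppress_crt_abort_dialog` is hypothetical and not from the article or the question. Two caveats: each CRT DLL keeps its own settings, so these calls only affect the runtime the calling module is linked against, and as far as I know `_set_abort_behavior` only exists from Visual C++ 2005 (msvcr80) onward. The `abort` in the dump comes from msvcr71, where `_set_error_mode` is the call that is available. From managed code you would presumably reach the same functions with a cdecl P/Invoke declaration against the specific CRT DLL.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the original post): ask the CRT that this
 * module links against to stop showing the abort() message box.
 * Returns 1 if the MSVC-specific calls were made, 0 on other toolchains. */
int suppress_crt_abort_dialog(void)
{
#if defined(_MSC_VER) && _MSC_VER >= 1400
    /* Send CRT runtime-error text to stderr instead of a message box. */
    _set_error_mode(_OUT_TO_STDERR);
    /* Clear both flags: no abort message and no fault-reporting call. */
    _set_abort_behavior(0, _WRITE_ABORT_MSG | _CALL_REPORTFAULT);
    return 1;
#else
    /* Other C runtimes have no abort dialog, so there is nothing to disable. */
    return 0;
#endif
}
```

The call would need to run once, early in process startup, before any code path that can reach `abort`.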

Andrew T Finnell
  • I believe this is the correct answer. I'm not exactly sure where to put this call; maybe in a static constructor. I also found that we can override the system behavior for crashes like this: http://support.microsoft.com/kb/124873. I recommended that the customer set ErrorMode = 2 on their web server for now. – Paul Williams Nov 09 '12 at 17:57
  • @PaulWilliams Were you able to confirm that the registry key setting worked on all your cases? That is a good find. – Andrew T Finnell Nov 09 '12 at 18:56
  • In my case the registry key setting didn't work. Our w3wp process still hangs in the way described by the question. – Shane Neuville Jul 13 '16 at 19:15