2

We have a C# Windows service that runs a process that is split into multiple tasks. Most of the tasks contact a web service using WCF to perform work against a database. The service's tasks are run in multiple threads.

A customer opened a support case with us saying the Windows service was occasionally failing to respond and needed to be restarted. I acquired a memory dump from the Windows service. I ran DebugDiag 2.0 to analyze the dump file.

The DebugDiag report had an interesting entry in the summary:

The following threads in WindowsService.DMP are attempting to make an HttpWebRequest, however they do not appear to be waiting on the remote server to respond (eg. not 'on the wire'). One or more of these requests are using at least half of its maximum number of available connections.

( 17 18 27 31 32 33 42 ) 12.07% of threads blocked (7 threads)

If many threads are in this state, it is often an indication that a throttling limit (i.e. the 'maxconnection' setting) has been exhausted. Click on any thread in the list to the left to review the throttling details for the WebRequest on which it is waiting.

If necessary, you can increase the number of connections available by either modifying the 'maxconnection' parameter in the application configuration file (see <connectionManagement> Element), or by modifying the appropriate ConnectionLimit property programmatically (see Managing Connections).

I jumped to thread 17 and saw this:

Thread 17 - System ID 4612

Entry point mscorwks!Thread::intermediateThreadProc Create time 9/10/2015 10:13:14 AM Time spent in user mode 0 Days 00:00:00.000 Time spent in kernel mode 0 Days 00:00:00.000

This thread is attempting to make an HttpWebRequest, however they do not appear to be waiting on the remote server to respond (eg. not 'on the wire'). One or more of these requests are using at least half of its maximum number of available connections.

Warning, at least half of the availabe connections are being used

HttpRequest URI:http://WebServer/MyWebSite/SubDir/MyService.svc ServicePoint - ConnectionLimit:48 CurrentConnections:44

The HttpWebRequest object is a loopback address but the connection limit still applies to this webrequest object because the connection limit is defined (either through autoconfig set to true in the processModel section or by adding a * entry inside connectionManagement section

The .NET Call Stack is below:

Function

[[HelperMethodFrame_1OBJ] (System.Threading.WaitHandle.WaitOneNative)] System.Threading.WaitHandle.WaitOneNative(Microsoft.Win32.SafeHandles.SafeWaitHandle, UInt32, Boolean, Boolean) 
mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int64, Boolean)+2f 
mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int32, Boolean)+25 
System_ni!System.Net.LazyAsyncResult.WaitForCompletion(Boolean)+d3 
System_ni!System.Net.Connection.SubmitRequest(System.Net.HttpWebRequest)+2b7 
System_ni!System.Net.ServicePoint.SubmitRequest(System.Net.HttpWebRequest, System.String)+7c 
System_ni!System.Net.HttpWebRequest.SubmitRequest(System.Net.ServicePoint)+f9 
System_ni!System.Net.HttpWebRequest.GetRequestStream(System.Net.TransportContext ByRef)+1d3 
System_ni!System.Net.HttpWebRequest.GetRequestStream()+e 
System_ServiceModel_ni!System.ServiceModel.Channels.HttpOutput+WebRequestHttpOutput.GetOutputStream()+45 
System_ServiceModel_ni!System.ServiceModel.Channels.HttpOutput.Send(System.TimeSpan)+f6 
System_ServiceModel_ni!System.ServiceModel.Channels.HttpChannelFactory+HttpRequestChannel+HttpChannelRequest.SendRequest(System.ServiceModel.Channels.Message, System.TimeSpan)+121 
System_ServiceModel_ni!System.ServiceModel.Channels.RequestChannel.Request(System.ServiceModel.Channels.Message, System.TimeSpan)+cb 
System_ServiceModel_ni!System.ServiceModel.Dispatcher.RequestChannelBinder.Request(System.ServiceModel.Channels.Message, System.TimeSpan)+17 
System_ServiceModel_ni!System.ServiceModel.Channels.ServiceChannel.Call(System.String, Boolean, System.ServiceModel.Dispatcher.ProxyOperationRuntime, System.Object[], System.Object[], System.TimeSpan)+1a2 
System_ServiceModel_ni!System.ServiceModel.Channels.ServiceChannel.Call(System.String, Boolean, System.ServiceModel.Dispatcher.ProxyOperationRuntime, System.Object[], System.Object[])+33 
System_ServiceModel_ni!System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(System.Runtime.Remoting.Messaging.IMethodCallMessage, System.ServiceModel.Dispatcher.ProxyOperationRuntime)+43 
System_ServiceModel_ni!System.ServiceModel.Channels.ServiceChannelProxy.Invoke(System.Runtime.Remoting.Messaging.IMessage)+65 
mscorlib_ni!System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(System.Runtime.Remoting.Proxies.MessageData ByRef, Int32)+bd
[[TPMethodFrame] (IMyWebService.GetDataSet)]
IMyWebService.GetDataSet(System.Guid, System.String, System.Data.DataSet)
<service code snipped>
mscorlib_ni!System.Threading.ThreadHelper.ThreadStart_Context(System.Object)+66
mscorlib_ni!System.Threading.ExecutionContext.runTryCode(System.Object)+51 
[[HelperMethodFrame_PROTECTOBJ] (System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup)]
System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object) 
mscorlib_ni!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+67 
mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)+45 
mscorlib_ni!System.Threading.ThreadHelper.ThreadStart()+44 
[[GCFrame]] 

I see the recommendations that it made and have started researching them. My question is:

Why is DebugDiag saying the thread does not appear to be waiting for the server to respond?

Looking at the .NET Reference Source, the request appears to have been submitted successfully, and the service appears to be waiting for a response.

UPDATE

Breaking into a normal call, I do see a call stack waiting at ws2_2 as suggested by Puneet Gupta below:

ntdll.dll!_NtWaitForSingleObject@12()
mswsock.dll!_SockWaitForSingleObject@16()
mswsock.dll!_WSPRecv@36()
***ws2_32.dll!_recv@16()***
System.ni.dll!6c084a13()
[Managed to Native Transition]  
System.dll!System.Net.Sockets.Socket.Receive(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, out System.Net.Sockets.SocketError errorCode)
System.dll!System.Net.Sockets.Socket.Receive(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags)
System.dll!System.Net.Sockets.NetworkStream.Read(byte[] buffer, int offset, int size)
System.dll!System.Net.PooledStream.Read(byte[] buffer, int offset, int size)
System.dll!System.Net.Connection.SyncRead(System.Net.HttpWebRequest request, bool userRetrievedStream, bool probeRead)
System.dll!System.Net.ConnectStream.ProcessWriteCallDone(System.Net.ConnectionReturnResult returnResult)
System.dll!System.Net.HttpWebRequest.WriteCallDone(System.Net.ConnectStream stream, System.Net.ConnectionReturnResult returnResult)
System.dll!System.Net.ConnectStream.CallDone(System.Net.ConnectionReturnResult returnResult)
System.dll!System.Net.ConnectStream.ResubmitWrite(System.Net.ConnectStream oldStream, bool suppressWrite)
System.dll!System.Net.HttpWebRequest.EndWriteHeaders_Part2()
System.dll!System.Net.HttpWebRequest.EndWriteHeaders(bool async)
System.dll!System.Net.HttpWebRequest.WriteHeadersCallback(System.Net.WebExceptionStatus errorStatus, System.Net.ConnectStream stream, bool async)
System.dll!System.Net.ConnectStream.WriteHeaders(bool async)
System.dll!System.Net.HttpWebRequest.EndSubmitRequest()
System.dll!System.Net.HttpWebRequest.CheckDeferredCallDone(System.Net.ConnectStream stream)
System.dll!System.Net.HttpWebRequest.GetResponse()
System.ServiceModel.dll!System.ServiceModel.Channels.HttpChannelFactory<System.ServiceModel.Channels.IRequestChannel>.HttpRequestChannel.HttpChannelRequest.WaitForReply(System.TimeSpan timeout)
System.ServiceModel.dll!System.ServiceModel.Channels.RequestChannel.Request(System.ServiceModel.Channels.Message message, System.TimeSpan timeout)
System.ServiceModel.dll!System.ServiceModel.Dispatcher.RequestChannelBinder.Request(System.ServiceModel.Channels.Message message, System.TimeSpan timeout)  Unknown
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannel.Call(string action, bool oneway, System.ServiceModel.Dispatcher.ProxyOperationRuntime operation, object[] ins, object[] outs, System.TimeSpan timeout)
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(System.Runtime.Remoting.Messaging.IMethodCallMessage methodCall, System.ServiceModel.Dispatcher.ProxyOperationRuntime operation)
System.ServiceModel.dll!System.ServiceModel.Channels.ServiceChannelProxy.Invoke(System.Runtime.Remoting.Messaging.IMessage message)
mscorlib.dll!System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(ref System.Runtime.Remoting.Proxies.MessageData msgData, int type)

So normally, it will be waiting for a response from the Windows socket. In this case, the threads are probably waiting for connections to become available to process the request-- as shown by the other DebugDiag messages.

Paul Williams
  • 16,585
  • 5
  • 47
  • 82

2 Answers2

1

The reason why the message says that it doesn't seem to be waiting on the wire is because the last frame on the thread is waithandle.waitone.

For a thread which is really waiting on the wire we should see ws2_32 (in the native stack) which is the windows socket library or some functions from system.net.sockets in the managed stack.

Were you able to capture multiple dumps ? If yes, did you see that thread state changed in one dump vs. the second one ?

Puneet Gupta
  • 2,237
  • 13
  • 17
  • Interesting. No, I have only one dump, but I can take > 1 in the future if it happens again. My best guess using WinDbg is that the threads appear to have been waiting an overly long time-- around 1 to 2 hours. – Paul Williams Sep 17 '15 at 21:36
1

Does the !dso (!DumpStackObjects) show any connection object in current thread? If so, what is m_WaitList value in System.Net.Connection object and what is value of m_CurrentRequest? Since the your callstack does NOT show we are waiting on ws2_32 (WinSock), it is an indication that HWR is still waiting for getting either an available connection or a socket.

hugh
  • 11
  • 2
  • Thank you for responding. There is a connection. The connection's wait list has 1 HTTP request to the same URL that the current thread is attempting to contact. (I do expect multiple threads to be calling the same endpoint for different purposes.) On this connection, there is no current request. `m_CurrentRequest` is null. – Paul Williams Sep 18 '15 at 19:53
  • how about m_Free/m_Idle Boolean in connection. Since this connection is backlogged as 1, but has no owner request. m_Free and m_Idle will tell you the connection status. – hugh Sep 18 '15 at 21:52
  • Did you test to set ServicePointManager.DefaultConnecitonLimit as 96 before creating HWR? – hugh Sep 18 '15 at 22:01
  • m_Free and m_Idle are both 0. m_ReadDone = 0, m_WriteDone = 1. The ConnectionGroup has a maximum of 48 connections. There are currently 45 connections in this group. The first 7 connections all have a wait list length of 1. I looked at a few of the others, and they have a wait list of length 0. I can try a higher connection limit. However something looks weird. I don't understand why the HttpWebRequests were assigned connections that were already waiting. Why not use an idle connection? – Paul Williams Sep 21 '15 at 13:27
  • I am only aware of some old locking issue for HWR documented in https://support.microsoft.com/en-us/kb/2497453, but not sure if your scenario is related to that. – hugh Sep 22 '15 at 19:23