9

This issue is related to a WCF hosted in a windows service on a windows server 2003.

The issue happens after several days of correct working and get fixed only by rebooting the server.

The C# code doesn't catch any exception in its log

serviceHost.Opening += new EventHandler(serviceHost_Opening);
serviceHost.Opened += new EventHandler(serviceHost_Opened);
serviceHost.Closing += new EventHandler(serviceHost_Closing);
serviceHost.Closed += new EventHandler(serviceHost_Closed);
serviceHost.Faulted += new EventHandler(serviceHost_Faulted);
serviceHost.UnknownMessageReceived += new EventHandler<UnknownMessageReceivedEventArgs>(serviceHost_UnknownMessageReceived);

serviceHost.Open();

The handlers are implemented like that:

 void serviceHost_Opened(object sender, EventArgs e)
{ 
    CentralReport.MyService.SrvLog("WinCentralRpt",String.Format("service opened by sender: {0}", sender.GetType().ToString()));
}

void serviceHost_Opening(object sender, EventArgs e)
{
    CentralReport.MyService.SrvLog("WinCentralRpt",String.Format("service opening by sender: {0}", sender.GetType().ToString()));
}    

public static void SrvLog(string user, string line) {
    string log_path = System.Configuration.ConfigurationManager.AppSettings["srv_log"];
    if (log_path != null) {
        using (System.IO.StreamWriter logSW = new System.IO.StreamWriter(
            log_path.Replace("{user}",user.ToLower()),true)) {
            logSW.WriteLine(DateTime.Now.ToString("yyyy-MM-dd hh:mm:ss tt", CultureInfo.InvariantCulture) + ": " + line);
        }   
    }
}

On client side, C# WCF consumer correctly closes the connection, after invoking an operation.

Maximum level tracing has been activated.

   <system.diagnostics>
      <sources>
            <source name="System.ServiceModel" 
                        switchValue="All"
                        propagateActivity="true">
                <listeners>
                        <add name="xml" />
                </listeners>
             </source>
            <source name="CardSpace"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
            <source name="System.IO.Log"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
            <source name="System.Runtime.Serialization"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
            <source name="System.IdentityModel"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
            <source name="System.ServiceModel.MessageLogging"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
            <source name="System.ServiceModel.Activation"
                        switchValue="All">
                <listeners>
                    <add name="xml" />
                </listeners>
            </source>
      </sources>

        <sharedListeners>
            <add name="xml"
                 type="System.Diagnostics.XmlWriterTraceListener"
                 initializeData="C:\MyPath\Traces.svclog" />
        </sharedListeners>
<trace autoflush="true" />
   </system.diagnostics> 

with

    <system.serviceModel>
  <diagnostics wmiProviderEnabled="true">
      <messageLogging 
           logEntireMessage="true" 
           logMalformedMessages="true"
           logMessagesAtServiceLevel="true" 
           logMessagesAtTransportLevel="true"
           maxMessagesToLog="3000" 
       />
  </diagnostics>

and

    <serviceBehaviors>
        <behavior name="MyServiceBehavior">
            <dataContractSerializer maxItemsInObjectGraph="6553500"/>
            <serviceMetadata httpGetEnabled="true"/>
            <serviceDebug includeExceptionDetailInFaults="true"/>
        </behavior>
    </serviceBehaviors>

Operating system checks seem fine

netstat -ao | grep 9nnn
tasklist /SVC | grep Opxxx

My feeling was that the channel listeners are not working, but nothing shows any issues in the trace log. The server just stops responding to requests for this WCF, but it writes nothing to logs from that moment on. From a visual perspective here it is the non-responding versus the ok graph.

non-responding - vs - enter image description here

Could someone suggest a softer solution than rebooting the server? Obviously, already tried to restart the host service (and other system services like dns and ipsec) and even reinstalled the WCF. Only rebooting the server does the trick.

Any idea what could be the root cause?

Edit - after the reboot

After the reboot - without any other changes - the WCF started to work again.

Side note

There is another minor point that I don't fully understand. Running now - when everything is working fine - the command

httpcfg.exe query urlacl 

that is the windows 2003 version of

netsh http show urlacl

I would have expected to see the WCF url but the output is only (why?)

C:\>httpcfg.exe query urlacl
    URL : http://+:80/Temporary_Listen_Addresses/
    ACL : D:(A;;GX;;;WD)
-------------------------------------------------------
  • 1
    What is the context of all those events? Maybe `serviceHost` is not getting collected because the event handlers retain a reference. If you really need them then make sure you `-=` them before closing. – Crowcoder Apr 26 '16 at 17:34
  • @Crowcoder Question edited to include an example of handlers: they simply write a log file. But consider also that I've activated the much more detailed wcf tracing... –  Apr 26 '16 at 17:40
  • well, I don't know. I'd say try running ANTs profiler and make your logging thread safe with a logging library or queue intermediary. – Crowcoder Apr 26 '16 at 17:55
  • @Crowcoder I appreciate it, thanks a lot but notice that the wcf stops responding also to the metadata wsdl request that has nothing to do with my handlers, it's completely standard and it's supposed to be captured only by tracing... BTW when I restart the service, even when it's not responding, the windows service restart gets regularly tracked in the log... –  Apr 26 '16 at 18:15
  • Most likely you are hitting some throttle limit (because of some bug which prevents sessions/instances from being closed. What is your throttle configuration? https://msdn.microsoft.com/en-us/library/ms731379(v=vs.110).aspx. Try to set throttle values very low,then make some calls and see if you can reproduce your problem. Also inspect windows log for messages from wcf - it should trace when those limits are reached. – Evk May 02 '16 at 16:34
  • Is there anything in the Windows Event Log? What is the status of the "Net.Tcp Listener Adapter" service when this happens? – JAZ May 02 '16 at 17:17
  • There was nothing relevant in the event logs. If I don't go wrong, net.tcp listener adpt is IIS service, but there is no IIS installed, only an old .net 4.0. @Evk yep, I also would be surprised that wcf tracing does not track throttle (I'll try to do the experiment): only a reboot can reset the limit? I guess just a "firewall-like" could reproduce all that conditions, but also in remote desktop on the server, the localhost:9nnn...?wsdl mex didn't respond!! ... Is better changing the host service's user from local system to local service? Any suggestion about httpcfg urlacl or similar cmd? –  May 02 '16 at 19:27
  • Instead of rebooting he server, does resetting (stop/start) the network adapter(s) cause it to start responding again? – Ton Plooij May 05 '16 at 12:26

4 Answers4

1

It's a bug in the ThreadPool for WCF of old dated .Net 4.0 versions. You can find some references here and here.

You should move the service account from local system to local service. This is the windows 2003 command to reserve your URL for the local service account.

httpcfg set urlacl /u http://+:9nnn/Your/Url/ /a "D:(A;;GX;;;LS)"

Furthermore, you can just enable the WCF performance counters and turn off the WCF tracing.

<system.serviceModel>
  <diagnostics performanceCounters="All" />
  ...

They will show up under the performance monitor's Service.ModelService category.

In conclusion, replace the threading operations in your WCF service (like DB access, etc...) with a separate process or with a distinct (non WCF) worker service.

Related issues

Similar issues have been documented worldwide

Technical Bulletin: WCF services may become unresponsive after running with high load

If the symptoms apply, restart the server hosting the unresponsive service.

An interesting msdn suggestion may be appropriate here

Troubleshooting: Connection Forcibly Closed

Cause:

Faulty network hardware is dropping portions of the TCP traffic

or

The SynAttackProtect setting may be dropping connections.

Looking at the following registry may prove useful

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

The following article mentions 2 reasons:

  • OS runs out of memory for TCP buffers
  • OS runs out of available TCP “ephemeral” ports (KB Microsoft referring to the above said registry tcp/ip parameters)
Community
  • 1
  • 1
1

The underlying problem

It is a Nonpaged Pool resource issue.

The unresponsive WCF is just an early bird message, sent before this happens.

Indeed, the following is a typical symptom of memory leak

You can still ping the server, and it still processes some basic commands, like a shutdown command, but it does not allow browsing files or folders, and you cannot RDP to the server.

You have a couple of tools to monitor it:

  • process explorer
  • poolmon

Read these two articles by Pearman and Russinovich for full guidance.

Side notes

You can also check the http log under \System32\LogFiles\HTTPERR.

There is a possible hotfix to be applied.

  • Right. As I can see from the [Error logging HTTP APIs](https://support.microsoft.com/en-us/kb/820729), the presence of `Connections_Refused` proves that http.sys has stopped receiving new connections cause NonPagedPool memory has dropped below the limit. –  Oct 07 '16 at 11:03
0

I have had weird issues before with MemoryStream's and Binary Writers on older versions of Windows with web services before. Nothing ever made much sense, don't know if it were a hardware limitation or some weird one-off .Net bug. Maybe try pulling the Stream out of the using block and handling it with a Try-Finally situation. In the past with the issue with the binary writers, I would have to handle it myself and it worked fine. Almost like the "using" block was attempting to close out of everything before it was finished.

Kevin B Burns
  • 1,032
  • 9
  • 24
0

Are you consuming services from your services? Remember to close connections everywhere you are consuming services (even on server-side).

In WCF as you may know there exist a ChannelFactory, that factory provides you with your service proxy but it is a ICommunicationObject too, you must dispose that service proxy instance to close the connection.

Check your ServiceHost destruction with a directive PerRequest, in order to monitor if disposal of service host instances are allowed and if there exist hard references taken in other parts of your code disallowing be Gcollected.

I think that you may have runned out the maximum connection count.