Our web application has begun crashing for no apparent reason, and at the moment I have no idea what the cause could be.

We are running Basic Authentication for SOAP services and ADFS for the main web application. The crashes can occur at any time of day. It is a test environment with fairly low traffic. Below are some logs I extracted from the times when a crash was detected.

<Event>
    <System>
      <Provider Name="ASP.NET 4.0.30319.0"/>
      <EventID>1309</EventID>
      <Level>2</Level>
      <Task>0</Task>
      <Keywords>Keywords</Keywords>
      <TimeCreated SystemTime="2015-06-12T11:23:21Z"/>
      <EventRecordID>274964734</EventRecordID>
      <Channel>Application</Channel>
      <Computer>RD0003FF410F64</Computer>
      <Security/>
    </System>
    <EventData>
      <Data>3001</Data>
      <Data>The request has been aborted.</Data>
      <Data>6/12/2015 11:23:21 AM</Data>
      <Data>6/12/2015 11:23:21 AM</Data>
      <Data>b1c5d35e8a26444ba38a8c6a0af0236f</Data>
      <Data>1305</Data>
      <Data>4</Data>
      <Data>0</Data>
      <Data>/LM/W3SVC/698610343/ROOT-1-130784515189471125</Data>
      <Data>Full</Data>
      <Data>/</Data>
      <Data>D:\home\site\wwwroot\</Data>
      <Data>RD0003FF410F64</Data>
      <Data></Data>
      <Data>6384</Data>
      <Data>w3wp.exe</Data>
      <Data>IIS APPPOOL\xxxx-test</Data>
      <Data>HttpException</Data>
      <Data>
        Request timed out.

      </Data>
      <Data>https://xxx.yy:443/</Data>
      <Data>/</Data>
      <Data>111.11.11.11</Data>
      <Data></Data>
      <Data>False</Data>
      <Data></Data>
      <Data>IIS APPPOOL\xxxx</Data>
      <Data>963</Data>
      <Data>IIS APPPOOL\xxxx</Data>
      <Data>False</Data>
      <Data>
      </Data>
    </EventData>
  </Event>
</Events>


 <EventData>
      <Data>3005</Data>
      <Data>An unhandled exception has occurred.</Data>
      <Data>6/18/2015 5:43:35 AM</Data>
      <Data>6/18/2015 5:43:35 AM</Data>
      <Data>ff2588624f0f47bc86f14cb636d4ca12</Data>
      <Data>1759</Data>
      <Data>3</Data>
      <Data>0</Data>
      <Data>/LM/W3SVC/1001219836/ROOT-1-130789123624036190</Data>
      <Data>Full</Data>
      <Data>/</Data>
      <Data>D:\home\site\wwwroot\</Data>
      <Data>RD0003FF410F64</Data>
      <Data></Data>
      <Data>6988</Data>
      <Data>w3wp.exe</Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>WebException</Data>
      <Data>
        Unable to connect to the remote server
        at System.Net.HttpWebRequest.GetResponse()
        at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)

        An attempt was made to access a socket in a way forbidden by its access permissions 111.11.11.111:443
        at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
        at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket&amp; socket, IPAddress&amp; address, ConnectSocketState state, IAsyncResult asyncResult, Exception&amp; exception)

      </Data>
      <Data>https://111.111.11.11:443/</Data>
      <Data>/</Data>
      <Data>111.111.11.11</Data>
      <Data></Data>
      <Data>False</Data>
      <Data></Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>1116</Data>
      <Data>IIS APPPOOL\xxx__70d6</Data>
      <Data>False</Data>
      <Data>
        at System.Net.HttpWebRequest.GetResponse()
        at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext)
      </Data>
    </EventData>
  </Event>
Mikael Nyborg

2 Answers

Azure WebApps limit the number of TCP connections that can be open simultaneously, and the error you are getting, "An attempt was made to access a socket in a way forbidden...", typically occurs when that limit is hit. The limit is higher on Large instances and lower on Small instances (I think 4000 for Small, but I may be wrong). You can run into it if you are not closing TCP connections to external services properly, or if you open thousands of connections within a few minutes. Most of the time, the cause is connections not being closed properly.

Isolating which site is opening the connections can be challenging if you have many sites hosted in the same App hosting plan. If you have just a few sites in one hosting plan, you can collect a dump using DaaS (Diagnostics as a Service) WHILE THE ISSUE IS HAPPENING, download the dump locally, and open it in a tool like WinDbg to see how many System.Net.Sockets.Socket objects exist. If you can, isolate the site responsible for opening too many connections by splitting the sites across different App hosting plans, or scale up to a larger instance to allow more TCP connections.

Troubleshooting this is a bit tricky, so you can engage Microsoft Support and they can assist, but I hope this gives you a starting point. If you need further assistance, please email me at puneetg[at]Microsoft.com; we can try a few things and then share our findings here with the community. I am trying to see how we can make troubleshooting this scenario easier in the future.
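One common way connections end up not being closed properly is creating a new client object for every outgoing call. The following is only a minimal sketch of the usual mitigation, assuming the app makes its outbound calls through `HttpClient`; the thread does not show the application's code, so the names `SharedHttp` and `FetchAsync` and the 30-second timeout are hypothetical and used purely for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: one HttpClient for the whole process, so outbound
// calls reuse pooled connections instead of opening a new socket each time.
static class SharedHttp
{
    // Created once and never disposed per request. The timeout value is an
    // arbitrary assumption, not a recommendation from the thread.
    public static readonly HttpClient Client = new HttpClient
    {
        Timeout = TimeSpan.FromSeconds(30)
    };

    public static async Task<string> FetchAsync(Uri url)
    {
        // Dispose the response (not the client) so the underlying
        // connection is released back to the pool promptly.
        using (HttpResponseMessage response =
            await Client.GetAsync(url).ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

Callers would then use `await SharedHttp.FetchAsync(new Uri("https://example.com/"))` rather than wrapping a fresh `HttpClient` in a `using` block on every request. The same idea of one long-lived, shared client generally applies to other connection-holding clients as well.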

EDIT - December 4, 2017

As of now, you can monitor TCP Connections for your WebApp by going to "Diagnose and Solve" blade and clicking on TCP Connections. Quick screenshots available @ https://twitter.com/puneetguptams/status/936669451931459584

Puneet Gupta
  • Thank you very much for your answer and for pointing me in the right direction. It was exactly as you said: too many TCP ports open on the server. – Mikael Nyborg Jun 24 '15 at 07:15

I tried running the crash dumps through WinDbg, with mixed results. It was hard to get any real information out of WinDbg because I had a hard time getting all the symbols to load correctly. So I built a Windows console app instead, deployed both my application and the console app to the same Azure Cloud Service, and collected information about open TCP ports. The result was very clear: my Redis cache never (or very seldom) closed its TCP ports, so I soon had more than 3000 connections and the server crashed. I refactored my code to use table storage instead, and now it seems to work. I am attaching my little console app for anyone who is interested in testing their own apps for leaking TCP ports.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    namespace tcp_ports
    {
        using System.Net.NetworkInformation;
        using System.Threading;

        class Program
        {
            static void Main(string[] args)
            {
                do
                {
                    IPGlobalProperties properties = IPGlobalProperties.GetIPGlobalProperties();
                    TcpConnectionInformation[] connections = properties.GetActiveTcpConnections();
                    Dictionary<String, int> ips = new Dictionary<string, int>();
                    Dictionary<String, String> ipsLocal = new Dictionary<String, String>();

                    Console.Clear();
                    Console.WriteLine("Number of open TCP Connections = {0}", connections.Count());
                    Console.WriteLine("=========================================");

                    foreach (TcpConnectionInformation c in connections)
                    {
                        String ip = c.RemoteEndPoint.Address.ToString();
                        if (ips.ContainsKey(ip))
                        {
                            ips[ip]++;
                            ipsLocal[ip] = c.LocalEndPoint.Address.ToString();
                        }
                        else
                        {
                            ips.Add(ip, 1);
                            ipsLocal.Add(ip, c.LocalEndPoint.Address.ToString());
                        }
                    }

                    var sortedIPs = from entry in ips orderby entry.Value descending select entry;

                    // Print only the 20 remote addresses with the most connections.
                    foreach (var ip in sortedIPs.Take(20))
                    {
                        Console.WriteLine("{0} <==> {1} = {2}", ip.Key, ipsLocal[ip.Key], ip.Value);
                    }

                    Thread.Sleep(1000);

                } while (true);

            }
        }
    }
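A possible variation on the console app above is to group the connections by TCP state instead of by remote address, which can help tell lingering connections (for example `TimeWait` or `CloseWait`) apart from established ones. This is only a sketch: `TcpStateSummary`, `CountByState`, and `Print` are hypothetical names introduced here. Like the original, it relies on `GetActiveTcpConnections()`, which can be denied in restricted hosting environments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of the original console app.
static class TcpStateSummary
{
    // Pure counting step, kept separate so it can be exercised
    // without touching the machine's network stack.
    public static Dictionary<TcpState, int> CountByState(IEnumerable<TcpState> states)
    {
        return states.GroupBy(s => s)
                     .ToDictionary(g => g.Key, g => g.Count());
    }

    // Reads the machine's active TCP connections and prints one line
    // per state, most frequent state first.
    public static void Print()
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();

        var counts = CountByState(connections.Select(c => c.State));
        foreach (var pair in counts.OrderByDescending(p => p.Value))
        {
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
        }
    }
}
```

Calling `TcpStateSummary.Print()` inside the same polling loop would show whether the connection count is dominated by established connections or by connections that are waiting to be torn down.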
Mikael Nyborg
  • I seriously appreciate you putting this code snippet here. I was under the impression that this API requires admin privileges and would be blocked in WebApps, but it looks like it is not. I will see how we can get this added to some of our built-in diagnostics so that other customers can diagnose issues like this easily. Thanks again! – Puneet Gupta Jun 25 '15 at 16:41
  • Ahh, I think I replied too fast. This code doesn't run in Azure WebApps because the API is indeed blocked. This is what I got when I ran this code sample in an Azure WebApp: Unhandled Exception: System.Net.NetworkInformation.NetworkInformationException: Access is denied at System.Net.NetworkInformation.SystemIPGlobalProperties.GetAllTcpConnections() at System.Net.NetworkInformation.SystemIPGlobalProperties.GetActiveTcpConnections() at managednetstat.Program.Main(String[] args). It looks like you moved the app to a cloud service and tested it there. – Puneet Gupta Jun 25 '15 at 16:54
  • Well, I cheated and deployed my app to a cloud-service where I could reproduce the tcp-connection errors, and then deployed my test-app where I could see my zombie-connections :) – Mikael Nyborg Jun 26 '15 at 05:59