-1

I'm seeing increased CPU and http queuing on my .NET API (using Nancy/OWIN).

It's running on Azure, so i took a memory dump.

I'm trying to figure out how to read the DebugDiag Analysis report.

Here's the warning section: enter image description here

I assume the "62 client connections" is the cause of the queueing, so i digged into that section.

There are a few common requests to the one URL, looking like this:

Host Header MyApi:80 GET request for /companies/list HTTP Version HTTP/1.1 SSL Request True Time alive 1 day(s) 03:43:21 QueryString
Request mapped to
HTTP Request State HTR_READING_CLIENT_REQUEST Native Request State NREQ_STATE_PROCESS Client Connection State CCS_STARTUP

But i don't really know what that means. Is that saying that the request is still open, and taking 1 day so far? They also feature in the "Running since" section.

The code is running async/await, so i'm wondering if i'm not handling cancellations etc properly?

Can anyone give me some advice on how to analyze the memory dump to diagnose the high cpu/queuing?

Thanks

RPM1984
  • 72,246
  • 58
  • 225
  • 350

1 Answers1

2

The report is telling you that there 62 requests which are running for more than 90 seconds on the server and they are still not finished executing so that is a problem for sure to investigate. The time reported (1 day) seems really high though so I am not sure if that is really accurate (though I would say that I have seen memory dumps where we see requests stuck for day long time). 62 request may not really mean queuing though.

 

The report also has  HTTP Context section which might tell you the thread on which the long running request is running and clicking on the thread should give you information on what that thread is really doing or what is it stuck on.

 

DebugDiag reports also have sections where they display callstack of each thread. you might have to quickly scan the entire report and see if you can see threads which have a .net callstack which gives an indication of a HTTP request being processed.

 

If you can share the full report, I can provide more details or feel free to open a case with Microsoft support and someone should be able to help.

After Looking at the Dump File

I spent some time trying to debug the dump file and all the requests are stuck in the OWIN Middleware of ASP.NET. Basically this module - webengine4-__DynamicModule_Microsoft.Owin.Host.SystemWeb.OwinHttpModule,

What this means is that you are hooking into OWIN Middleware using app.use method inside the Startup class's Configuration Method and one of those methods are asynchronously waiting on something that is taking a long time to execute...Found a DB

I just found one SQL timeout exception which is possibly triggered from the middleware code and this could possibly be related but not 100 % sure

000> !pe 0x000000cbe3d71e18 
Exception object: 000000cbe3d71e18
Exception type:   System.Data.SqlClient.SqlException
Message:          Connection Timeout Expired.  The timeout period elapsed during the post-login phase.  The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections.  The duration spent while attempting to connect to this server was - [Pre-Login] initialization=2859; handshake=11534; [Login] initialization=0; authentication=0; [Post-Login] complete=3510; 
InnerException:   System.ComponentModel.Win32Exception, Use !PrintException 000000cbe3d71ce8 to see more.
StackTrace (generated):
    SP               IP               Function
    000000CB896582C0 000007FE9E4DCC82 System_Data_ni!System.Data.SqlClient.SqlInternalConnectionTds..ctor(System.Data.ProviderBase.DbConnectionPoolIdentity, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SqlCredential, System.Object, System.String, System.Security.SecureString, Boolean, System.Data.SqlClient.SqlConnectionString, System.Data.SqlClient.SessionData, System.Data.ProviderBase.DbConnectionPool, System.String, Boolean)+0x572
    000000CB8965EC10 000007FE9E019480 System_Data_ni!System.Data.SqlClient.SqlConnectionFactory.CreateConnection(System.Data.Common.DbConnectionOptions, System.Data.Common.DbConnectionPoolKey, System.Object, System.Data.ProviderBase.DbConnectionPool, System.Data.Common.DbConnection, System.Data.Common.DbConnectionOptions)+0x460
    000000CB8965ED00 000007FE9E019006 System_Data_ni!System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(System.Data.ProviderBase.DbConnectionPool, System.Data.Common.DbConnection, System.Data.Common.DbConnectionOptions, System.Data.Common.DbConnectionPoolKey, System.Data.Common.DbConnectionOptions)+0x46
    000000CB8965ED60 000007FE9E018FB4 System_Data_ni!System.Data.ProviderBase.DbConnectionPool.CreateObject(System.Data.Common.DbConnection, System.Data.Common.DbConnectionOptions, System.Data.ProviderBase.DbConnectionInternal)+0x3c4
    000000CB8965EDE0 000007FE9E018942 System_Data_ni!System.Data.ProviderBase.DbConnectionPool.UserCreateRequest(System.Data.Common.DbConnection, System.Data.Common.DbConnectionOptions, System.Data.ProviderBase.DbConnectionInternal)+0x72
    000000CB8965EE30 000007FE9E01866F System_Data_ni!System.Data.ProviderBase.DbConnectionPool.TryGetConnection(System.Data.Common.DbConnection, UInt32, Boolean, Boolean, System.Data.Common.DbConnectionOptions, System.Data.ProviderBase.DbConnectionInternal ByRef)+0x65f
    000000CB8965EEB0 000007FE9E5A58A9 System_Data_ni!System.Data.ProviderBase.DbConnectionPool.WaitForPendingOpen()+0x159
    000000CB9A7DE920 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7DE960 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7DE990 000007FE4C714BF0 AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Repository.DapperRepository+<FindListingsAsync>d__50.MoveNext()+0x290
    000000CB9A7DC2A0 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7DC2E0 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7DC310 000007FEAA21E795 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(System.Threading.Tasks.Task)+0xd34de5
    000000CB9A7DC340 000007FE4C6DEF8F AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Services.Caching.Listing.ListingCache+<FindListingsForLocationsAsync>d__39.MoveNext()+0xd9f
    000000CB9A7D9DD0 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7D9E10 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7D9E40 000007FE4C6DA757 AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Services.Caching.Listing.ListingCache+<FindAllListingResultsAsync>d__24.MoveNext()+0x167
    000000CB9A7D7900 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7D7940 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7D7970 000007FE4CA4E50E AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Services.Caching.Listing.ListingCache+<GetSimilarListingIdsAsync>d__26.MoveNext()+0x26e
    000000CB9A7D53B0 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7D53F0 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7D5420 000007FE4CA46F0C AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Services.Caching.Listing.ListingCache+<GetListingDetailsAsync>d__25.MoveNext()+0x8bc
    000000CB9A7D2EF0 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7D2F30 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7D2F60 000007FE4CA46396 AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Modules.Listing.ListingModule+<DetailsAsync>d__2.MoveNext()+0x106
    000000CB9A7D09D0 000007FEAA21E814 mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(System.Threading.Tasks.Task)+0xd34d54
    000000CB9A7D0A10 000007FEA94E9A0D mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3d
    000000CB9A7D0A40 000007FE4CBD4635 UNKNOWN!DynamicClass.CallSite.Target(System.Runtime.CompilerServices.Closure, System.Runtime.CompilerServices.CallSite, System.Object)+0x65
    000000CB9A7D0A80 000007FE4CA45C01 AppianMedia_XWing_Application_Web_Api_Homely!AppianMedia.XWing.Application.Web.Api.Homely.Modules.Listing.ListingModule+<<-ctor>b__1_0>d.MoveNext()+0x5f1

StackTraceString: <none>

If this doesnt help, then you need to use some CLR level profilers (PerfView, NewRelic etc.) or use tools like Glimpse to get more details or add some tracing in our MiddleWare code.

Puneet Gupta
  • 2,237
  • 13
  • 17
  • thanks for your reply. I had a look at your suggestions but still can't figure it out :/ I've attached the full report here: http://www.filedropper.com/w3wp1800151c2016-07-2103-58-45-6170708multiplerules. Appreciate your help :) – RPM1984 Jul 22 '16 at 03:52
  • any ideas/suggestions? – RPM1984 Aug 08 '16 at 01:01
  • The link is not working for me now and I didn't see your comment earlier...sorry...can you share it via onedrive ? – Puneet Gupta Aug 09 '16 at 10:22
  • So in this case the DebugDiag report is not helping even though there are a lot of long running requests. One reason when this can happen is if the code is using asynchronous programming model in which case if the request execution is wait on some async call, it will not show up on a thread. I am afraid that for this case, a closer look at the dump file is required to understand why the requests are hanging. – Puneet Gupta Aug 10 '16 at 07:44
  • Yep, it's all async/await. I don't have VS enterprise so i can't easily analyse the dump file. Any tips? – RPM1984 Aug 10 '16 at 21:19
  • Rather than Visual Studio, we need to look at dump using WinDBG and the steps are not going to be that simple or straightforward unfortunately. – Puneet Gupta Aug 11 '16 at 05:36
  • Ugh. WinDBG. So painful :/ – RPM1984 Aug 11 '16 at 23:11
  • If you can share the dump file, I can give a quick look :-) – Puneet Gupta Aug 13 '16 at 16:19
  • edited my answer after looking at the dump file. Hope this helps – Puneet Gupta Aug 18 '16 at 12:19
  • Thanks for looking into this, really appreciate it. Do you know which piece of OWIN middleware is causing the issue? I have a few pieces of MW, the main one being Nancy (this is an API). Otherwise it might be a needle in a haystack. Can you give me some pointers into where i should be tracing/logging to get more insights? – RPM1984 Aug 18 '16 at 23:44
  • Try using Glimpse http://getglimpse.com/ or Perfview to make progress - https://channel9.msdn.com/Blogs/Developer-Support-Series-PGI-Sessions/Developer-Support-Series-Detect-ASPNET-Performance-Bottlenecks-in-a-Matter-of-Minutes-with-PerfViewH – Puneet Gupta Aug 21 '16 at 14:34
  • I can't use Glimpse, as it doesn't support OWIN/self host (im using Nancy). I can't use Perfview as the app is on an Azure Website :( – RPM1984 Aug 22 '16 at 00:11
  • Try using azureserviceprofiler.com ... it is something fairly new so not sure if it will work 100% but worth a try – Puneet Gupta Aug 22 '16 at 05:37
  • Okay ill look into and try and do some load testing to replicate locally. Thanks for all your help – RPM1984 Aug 22 '16 at 22:29
  • azureserviceprofiler.com will work against your Azure WebApp only so you may not have a need to replicate if that's ok with you – Puneet Gupta Aug 23 '16 at 08:02