1

I'm investigating a webapp which ran up to 10gb of memory, by analysing a memory dump using Windbg.

Here's the bottom of the !dumpheap -stat output:

00007ff9545df5d0   166523     13321840 System.Runtime.Caching.MemoryCache
00007ff9545df4a0   166523     14654024 System.Runtime.Caching.CacheMemoryMonitor
00007ff9545de990   166523     14654024 System.Runtime.Caching.SRef[]
00007ff9545dcef0   166523     14654024 System.Runtime.Caching.GCHandleRef`1[[System.Runtime.Caching.MemoryCacheStore, System.Runtime.Caching]][]
00007ff9545dfb28   166523     19982760 System.Runtime.Caching.MemoryCacheStatistics
00007ff956778510   333059     21315680 System.Int64[]
00007ff95679c988    41597     31250111 System.Byte[]
00007ff9545e08c8  1332184     31972416 System.Runtime.Caching.MemoryCacheEqualityComparer
00007ff9545dfe48  1332184     31972416 System.Runtime.Caching.SRef
00007ff956780ff0  1332200     31972800 System.SizedReference
00007ff956724620  1498777     35970648 System.Threading.TimerHolder
00007ff95677fb30  1536170     36868080 System.Runtime.Remoting.Messaging.CallContextSecurityData
00007ff956796f28  1606960     38567040 System.Object
00007ff9545df810  1332184     42629888 System.Runtime.Caching.GCHandleRef`1[[System.Runtime.Caching.MemoryCacheStore, System.Runtime.Caching]]
00007ff9545dda38  1332184     42629888 System.Runtime.Caching.UsageBucket[]
00007ff9567ae930  1332268     42632576 Microsoft.Win32.SafeHandles.SafeWaitHandle
00007ff9545df968  1498707     47958624 System.Runtime.Caching.GCHandleRef`1[[System.Threading.Timer, mscorlib]]
00007ff9567adbf8  1498777     47960864 System.Threading.Timer
00007ff9545dff50  1332184     53287360 System.Runtime.Caching.CacheUsage
00007ff94986ead8  1536137     61445480 System.Web.Hosting.AspNetHostExecutionContextManager+AspNetHostExecutionContext
00007ff9567a2838  1332210     63946080 System.Threading.ManualResetEvent
00007ff956796948   293525     66384986 System.String
00007ff9545dfef0  1332184     74602304 System.Runtime.Caching.CacheExpires
00007ff9567add20  1498760     95920640 System.Threading.TimerCallback
00007ff9545dfa90  1332184    106574720 System.Runtime.Caching.MemoryCacheStore
00007ff95679b3b0  1333289    106663120 System.Collections.Hashtable
00007ff95678f138  1536171    110604312 System.Runtime.Remoting.Messaging.LogicalCallContext
00007ff9545dffb0  1332184    127889664 System.Runtime.Caching.UsageBucket
00007ff95679d1e0  1333292    128664768 System.Collections.Hashtable+bucket[]
00007ff9567245c0  1498777    131892376 System.Threading.TimerQueueTimer
00007ff9567aec48  1536255    135190440 System.Threading.ExecutionContext
00007ff9545dcf78  1332184    351696576 System.Runtime.Caching.ExpiresBucket[]
000000f82c79d9f0   473339    385303992      Free
00007ff956799220 40309535   1617342672 System.Int32[]
00007ff9545e0468 39965520   3836689920 System.Runtime.Caching.ExpiresBucket

So there are nearly 40 million instances of System.Runtime.Caching.ExpiresBucket, totally nearly 4gb of the used memory. System.Runtime.Caching classes appear quite a lot in the top offenders.

I took a random instance of a System.Runtime.Caching.ExpiresBucket class, and did a !gcroot on it. It took ages (maybe 30 mins) to produce 1 thread...there may have been more but I interrupted the operation at this point.

The chain of references is over 1.5 million lines long! But I can show the important bits here:

0:000> !gcroot 000000f82dd4bc28
Thread 1964:
    000000fcbdbce6a0 00007ff8f9bbe388 Microsoft.AspNet.SignalR.SqlServer.ObservableDbOperation.ExecuteReaderWithUpdates(System.Action`2<System.Data.IDataRecord,Microsoft.AspNet.SignalR.SqlServer.DbOperation>)
        rbp-58: 000000fcbdbce6e8
            ->  000000fa2d1f26a0 Microsoft.AspNet.SignalR.SqlServer.ObservableDbOperation+<>c__DisplayClass1e
            ->  000000fa2d1f2110 Microsoft.AspNet.SignalR.SqlServer.ObservableDbOperation
            ->  000000fa2d1f24d0 System.Action
            ->  000000fa2d1f24a8 System.Object[]
            ->  000000fa2d1f2468 System.Action
            ->  000000fa2d1f1008 Microsoft.AspNet.SignalR.SqlServer.SqlReceiver
            ->  000000fa2d1f1330 System.Action
            ->  000000fa2d1f1308 System.Object[]
            ->  000000fa2d1f12c8 System.Action
            ->  000000fa2d1efb70 Microsoft.AspNet.SignalR.SqlServer.SqlStream
            ->  000000fa2d1f1528 System.Action
            ->  000000fa2d1f1500 System.Object[]
            ->  000000fa2d1f14c0 System.Action
            ->  000000fa2d1efb20 Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus+<>c__DisplayClass3
            ->  000000f92d0b84e0 Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus
            ->  000000f92d0b9568 System.Threading.Timer
            ->  000000f92d0b96d8 System.Threading.TimerHolder
            ->  000000f92d0b95a0 System.Threading.TimerQueueTimer
[... about 100 lines of the same TimerQueueTimer line above, but different memory addresses each time]
           ->  000000f92cf1be68 System.Threading.TimerQueueTimer
            ->  000000f92cf1be08 System.Threading.TimerCallback
            ->  000000f92cf1bb48 System.Web.RequestTimeoutManager
            ->  000000f92cf1bb80 System.Web.Util.DoubleLinkList[]
            ->  000000f92cf1bc00 System.Web.Util.DoubleLinkList
            ->  000000fb61323860 System.Web.RequestTimeoutManager+RequestTimeoutEntry
            ->  000000fb6131fd38 System.Web.HttpContext
            ->  000000fbe682a480 ASP.global_asax
            ->  000000fbe682ac00 System.Web.HttpModuleCollection
            ->  000000fbe682ac60 System.Collections.ArrayList
            ->  000000fbe682b598 System.Object[]
            ->  000000fbe682b018 System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
            ->  000000fbe682b000 System.Web.Routing.UrlRoutingModule
            ->  000000faacec1f40 System.Web.Routing.RouteCollection
            ->  000000faacec2030 System.Collections.Generic.List`1[[System.Web.Routing.RouteBase, System.Web]]
            ->  000000fa2cfe4d80 System.Web.Routing.RouteBase[]
            ->  000000f9acf14cd8 System.Web.Http.WebHost.Routing.HttpWebRoute
            ->  000000f9acf149f8 System.Web.Http.Routing.RouteCollectionRoute
            ->  000000f9acf1f4f0 System.Web.Http.Routing.SubRouteCollection
            ->  000000f9acf1f510 System.Collections.Generic.List`1[[System.Web.Http.Routing.IHttpRoute, System.Web.Http]]
            ->  000000fa2cf8f310 System.Web.Http.Routing.IHttpRoute[]
            ->  000000fa2ceff770 System.Web.Http.Routing.HttpRoute
            ->  000000fa2ceff678 System.Web.Http.Routing.HttpRouteValueDictionary
            ->  000000fa2ceff6f0 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[System.Object, mscorlib]][]
            ->  000000fa2cef9e78 System.Web.Http.Controllers.HttpActionDescriptor[]
            ->  000000fa2cef7898 System.Web.Http.Controllers.ReflectedHttpActionDescriptor
            ->  000000f9aced4608 System.Web.Http.HttpConfiguration
            ->  000000f9aced4db0 System.Net.Http.Formatting.MediaTypeFormatterCollection
            ->  000000f9aced6f40 System.Collections.Generic.List`1[[System.Net.Http.Formatting.MediaTypeFormatter, System.Net.Http.Formatting]]
            ->  000000f9aced6f80 System.Net.Http.Formatting.MediaTypeFormatter[]
            ->  000000f9aced4df8 System.Net.Http.Formatting.JsonMediaTypeFormatter
            ->  000000f9acf1f448 System.Web.Http.Validation.ModelValidationRequiredMemberSelector
            ->  000000f9acf1f468 System.Collections.Generic.List`1[[System.Web.Http.Validation.ModelValidatorProvider, System.Web.Http]]
            ->  000000f9acf1f490 System.Web.Http.Validation.ModelValidatorProvider[]
            ->  000000f9acf1db40 Ninject.Web.WebApi.Validation.NinjectDefaultModelValidatorProvider
            ->  000000faaceca438 Ninject.StandardKernel
            ->  000000faaceca498 Ninject.Components.ComponentContainer
            ->  000000faaceca538 System.Collections.Generic.Dictionary`2[[System.Type, mscorlib],[Ninject.Components.INinjectComponent, Ninject]]
            ->  000000f9acece000 System.Collections.Generic.Dictionary`2+Entry[[System.Type, mscorlib],[Ninject.Components.INinjectComponent, Ninject]][]
            ->  000000f9acecdac8 Ninject.Activation.Caching.GarbageCollectionCachePruner
            ->  000000f9acecdcb8 System.Threading.Timer
            ->  000000f9acecdd30 System.Threading.TimerHolder
            ->  000000f9acecdcd8 System.Threading.TimerQueueTimer
[... just under 1.5 million lines of the same TimerQueueTimer line above, but different memory addresses each time]
            ->  000000f82dd4c028 System.Threading.TimerQueueTimer
            ->  000000f82dd4bfc8 System.Threading.TimerCallback
            ->  000000f82dd4ada0 System.Runtime.Caching.CacheExpires
            ->  000000f82dd4add8 System.Runtime.Caching.ExpiresBucket[]
            ->  000000f82dd4bc28 System.Runtime.Caching.ExpiresBucket

I'm running !objsize on the 000000f92d0b84e0 Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus, but it's been running for about 20 mins so far, and I'm expecting a very high number.

What is causing the leak? Is there a known bug in SignalR, using Backplane? We're using v2.0.2, which I believe is not the latest. But I've not found any documentation on the internet referring to a memory leak involving both SignalR and Caching.

Edit:

To answer Pawel's question about instances of SqlMessageBus, here is some more Windbg output:

0:000> !dumpheap -type SqlMessageBus
         Address               MT     Size
0000007ced6832a0 00007ff8f9edb708       24     
0000007ced6832b8 00007ff8f9a97da8       64     
0000007ced6832f8 00007ff8f9a97c90      248     
0000007ced685370 00007ff8f9edbc18       24     
0000007d6d917f48 00007ff8f9a97ec8       40     
0000007feda7c400 00007ff8f9e88b98       32     

Statistics:
              MT    Count    TotalSize Class Name
00007ff8f9edbc18        1           24 System.Lazy`1+Boxed[[Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus, Microsoft.AspNet.SignalR.SqlServer]]
00007ff8f9edb708        1           24 System.Lazy`1+<>c[[Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus, Microsoft.AspNet.SignalR.SqlServer]]
00007ff8f9e88b98        1           32 Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus+<>c__DisplayClass3
00007ff8f9a97ec8        1           40 System.Lazy`1[[Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus, Microsoft.AspNet.SignalR.SqlServer]]
00007ff8f9a97da8        1           64 System.Func`1[[Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus, Microsoft.AspNet.SignalR.SqlServer]]
00007ff8f9a97c90        1          248 Microsoft.AspNet.SignalR.SqlServer.SqlMessageBus
Simon Green
  • 1,131
  • 1
  • 10
  • 28
  • See this https://github.com/SignalR/SignalR/issues/3369 as it may be relevant. I suggest you use latest to iron out legacy issues if you can otherwise it is bit of pain to dig in. – Farrukh Subhani Nov 09 '17 at 15:04
  • Hi Farrukh, yes had already seen that. Hard to say that it's relevant, as in that case the Redis support for backplane was blamed, but we're not using that, ours is SQL Server. I would preferably like definitive proof that the problem lies in SignalR code, and is fixed in later versions, before I suggest an upgrade which will in turn mean a new deployment to the client – Simon Green Nov 09 '17 at 15:33
  • You seem to be using Ninject. Are you using it with SignalR? Are you registering SqlMessageBus there? How many instances of SqlMessageBus do you see? The built-in DI container makes sure that there is only one SqlMessageBus by using lazy: https://github.com/SignalR/SignalR/blob/1fba14fa3437e24c204dfaf8a18db3fce8acad3c/src/Microsoft.AspNet.SignalR.SqlServer/DependencyResolverExtensions.cs#L38-L39 – Pawel Nov 09 '17 at 17:27
  • I don't know how to answer "Are you using it with SignalR? Are you registering SqlMessageBus there?" Can you tell me what these questions mean and what to look for in the code to answer? And there is only 1 instance of SqlMessageBus in the dump - see my edit above for details – Simon Green Nov 10 '17 at 07:19

1 Answers1

4

In our case, the reason in the end was because we had a class that was creating a new instance of MemoryCache in its constructor... and we were supposed to be binding that class as a Singleton, but we weren't. So essentially we were ending up with hundreds of thousands of MemoryCache instances.

Simon Green
  • 1,131
  • 1
  • 10
  • 28