I am trying to diagnose a memory leak in a C# client application. This application:
- runs in a hardened Windows environment
- communicates with a local unmanaged third-party API
- communicates via TCP with a server application
- plays wav files via waveOutWrite()
- integrates with a custom USB keyboard via the keyboard vendor's DLL
- accepts user input to perform actions against the third-party API
Under normal usage, the application uses between 50 and 100 MB of memory, depending on customer configuration. Our most recent update ran without issue for several weeks (we confirmed there was no memory issue during this time). Then, without any code changes, and without any changes to the client machines that the customer is aware of, we started experiencing the following:
- Uncontrolled memory growth, sometimes rapid and sometimes gradual, until an OutOfMemoryException is thrown
- Intermittent delayed/erratic responsiveness from the custom keyboard
- waveOutWrite() returning error value 1 when we attempt to play audio (happens before memory gets near max usage)
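For reference, assuming the value returned is a standard MMRESULT code as defined in mmsystem.h, 1 corresponds to MMSYSERR_ERROR (an unspecified error) rather than MMSYSERR_NOMEM (7). A minimal lookup sketch follows; `describe_mmresult` is a hypothetical helper name used only for illustration, not part of any Windows API:

```python
# MMRESULT values as defined in mmsystem.h.
# Generic codes start at MMSYSERR_BASE (0); waveform codes start at WAVERR_BASE (32).
MMRESULT_NAMES = {
    0: "MMSYSERR_NOERROR",
    1: "MMSYSERR_ERROR",          # unspecified error
    2: "MMSYSERR_BADDEVICEID",
    3: "MMSYSERR_NOTENABLED",
    4: "MMSYSERR_ALLOCATED",
    5: "MMSYSERR_INVALHANDLE",
    6: "MMSYSERR_NODRIVER",
    7: "MMSYSERR_NOMEM",          # the code one would expect for memory exhaustion
    8: "MMSYSERR_NOTSUPPORTED",
    9: "MMSYSERR_BADERRNUM",
    10: "MMSYSERR_INVALFLAG",
    11: "MMSYSERR_INVALPARAM",
    32: "WAVERR_BADFORMAT",
    33: "WAVERR_STILLPLAYING",
    34: "WAVERR_UNPREPARED",
    35: "WAVERR_SYNC",
}


def describe_mmresult(code):
    """Map a numeric MMRESULT to its symbolic name (hypothetical helper)."""
    return MMRESULT_NAMES.get(code, f"unknown MMRESULT {code}")


# The value observed from waveOutWrite() in this report.
print(describe_mmresult(1))
```

Logging the symbolic name alongside the raw value makes it easier to tell a generic failure apart from an explicit out-of-memory report.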
I have used DebugDiag 1.2 to monitor for leaks and have the resulting full dump. Initial warnings from the analysis are:
Dump shows 1.19 GB of allocation in Native Heaps. 634MB is from the Microsoft VC Runtime Heap (private), and 549MB from the DebugDiag LeakTrack heap. The 634MB heap has 44 segments, most of which are 15.81MB.
However, the allocations report doesn't seem to correspond. The top allocation by size is 992KB, and it is also the top by count, with only 3 allocations. Here are the top allocations for the 634MB heap:
Am I reading this wrong?
Moving to WinDbg, if I run !heap -stat -h [634MBheapaddress] -grp B, I get:
group-by: BLOCKCOUNT max-display: 20
size #blocks total ( %) (percent of totalblocks)
44 9ea51 - 2a23d84 (44.14)
1a 1fba7 - 338ef6 (8.83)
18 1b3b8 - 28d940 (7.58)
10 16913 - 169130 (6.28)
12 1222c - 146718 (5.05)
22 d9d0 - 1ceda0 (3.79)
1c 9bea - 110d98 (2.71)
14 9197 - b5fcc (2.53)
26 9115 - 15891e (2.52)
20 6774 - cee80 (1.80)
24 5094 - b54d0 (1.40)
30 4b03 - e1090 (1.30)
78 4a81 - 22ec78 (1.30)
28 48d2 - b60d0 (1.27)
4 48bb - 122ec (1.26)
58 48aa - 18fa70 (1.26)
1e 48a6 - 88374 (1.26)
2a 48a3 - beabe (1.26)
16 4898 - 63d10 (1.26)
600 4884 - 1b31800 (1.26)
If I'm reading this correctly, it shows the most common allocation size as 68 bytes (0x44), with about 650k allocations (0x9ea51). Is this correct? If so, it could be a potential issue, but it only represents about 44MB (0x2a23d84 bytes), nowhere near the 634MB I'm showing as reserved for that heap.
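The hexadecimal arithmetic behind that reading can be checked with a short script. This is only a sketch: it assumes every value in the !heap -stat rows is hexadecimal, and it includes just three rows copied from the output above. `parse_heap_stat` is a hypothetical helper name, not part of any debugger tooling.

```python
import re

# Three rows copied from the !heap -stat output; all values are hexadecimal.
SAMPLE = """\
size #blocks total ( %) (percent of totalblocks)
44 9ea51 - 2a23d84 (44.14)
1a 1fba7 - 338ef6 (8.83)
600 4884 - 1b31800 (1.26)
"""

# Matches "<size> <blocks> - <total> (" and ignores header lines.
ROW = re.compile(r"^\s*([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\(")


def parse_heap_stat(text):
    """Return a list of (size, block_count, total_bytes) tuples in decimal."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            size, count, total = (int(group, 16) for group in match.groups())
            rows.append((size, count, total))
    return rows


rows = parse_heap_stat(SAMPLE)
for size, count, total in rows:
    # size * count should reproduce the reported total for each row.
    consistent = "ok" if size * count == total else "MISMATCH"
    print(f"{size:>5} bytes x {count:>8,} blocks = {total:>11,} bytes ({consistent})")

print(f"sum of parsed rows: {sum(total for _, _, total in rows):,} bytes")
```

Running the same function over all 20 rows and summing the totals would show how much of the heap the listed block sizes account for.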
Either way, I'm not sure at this point how to figure out what these allocations are or what is making them. And I am at a loss as to why the issues would have started occurring without any code change on our side. I have to assume something has changed on the customer systems that they are not aware of and which has brought to light a bug in our code, but so far I've had no luck figuring out the root cause.
Any help would be greatly appreciated!