
I have a .NET WinForms application whose UI has hung.

The UI thread is blocked on the CritSec ntdll!LdrpLoaderLock+0 at 774920c0.

0:010> kb
ChildEBP RetAddr  Args to Child              
0fc4e034 773c8df4 000000d4 00000000 00000000 ntdll!NtWaitForSingleObject+0x15
0fc4e098 773c8cd8 00000000 00000000 773bfa84 ntdll!RtlpWaitOnCriticalSection+0x13e
0fc4e0c0 773bffd3 774920c0 71e65850 00000001 ntdll!RtlEnterCriticalSection+0x150
0fc4e230 773bfd2f 00000001 00000001 00000000 ntdll!LdrGetDllHandleEx+0x2f7
0fc4e24c 75dd1a43 00000001 00000000 0fc4e2bc ntdll!LdrGetDllHandle+0x18
WARNING: Stack unwind information not available. Following frames may be wrong.
0fc4e2a0 75dd1c57 0fc4e2bc b7a43509 02baed94 KERNELBASE!GetModuleFileNameW+0x1a9
0fc4e718 75dd1d52 00000001 00000002 02baed94 KERNELBASE!GetModuleFileNameW+0x3bd
0fc4e730 5fc58fc5 02baed94 b108f787 7285ac08 KERNELBASE!GetModuleHandleW+0x29
...

And the critical section 774920c0 is owned by thread 1bd4.

0:010> !locks
CritSec ntdll!LdrpLoaderLock+0 at 774920c0
WaiterWoken        No
LockCount          32
RecursionCount     1
OwningThread       1bd4
EntryCount         0
ContentionCount    27a
*** Locked

CritSec +1027c5bc at 1027c5bc
WaiterWoken        No
LockCount          3
RecursionCount     1
OwningThread       1a4c
EntryCount         0
ContentionCount    1bb
*** Locked

CritSec +17700138 at 17700138
WaiterWoken        No
LockCount          1
RecursionCount     1
OwningThread       1620
EntryCount         0
ContentionCount    d
*** Locked

The WinDbg thread number of thread 1bd4 is 68.

0:010> ~
#  0  Id: ccc.124c Suspend: 1 Teb: 7efdd000 Unfrozen
   1  Id: ccc.88c Suspend: 1 Teb: 7efda000 Unfrozen
   ...
   67  Id: ccc.514 Suspend: 1 Teb: 7ef36000 Unfrozen
   68  Id: ccc.1bd4 Suspend: 1 Teb: 7ef30000 Unfrozen
   69  Id: ccc.1a30 Suspend: 1 Teb: 7ef1b000 Unfrozen
   ...

However, thread 68 does not appear in the output of !threads. Does that mean 68 is a dead thread?

0:010> !threads
ThreadCount:      65
UnstartedThread:  13
BackgroundThread: 37
PendingThread:    13
DeadThread:       5
Hosted Runtime:   no
                                                                         Lock  
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 124c 004db888     26020 Preemptive  3E3A01CC:00000000 004cf1b0 0     STA 
   2    2 14e0 004e7a20     2b220 Preemptive  00000000:00000000 004cf1b0 0     MTA (Finalizer) 
 ...
   8   10 19c4 055691e0   3029220 Preemptive  00000000:00000000 004cf1b0 0     MTA (Threadpool Worker) 
  10   12 1520 0f91cbc8     27220 Preemptive  00000000:00000000 004cf1b0 0     STA
 ...
  66   46 169c 29e344b8   3029220 Preemptive  00000000:00000000 004cf1b0 0     MTA (Threadpool Worker) 
XXXX   33    0 19ff4a90   8039820 Preemptive  00000000:00000000 004cf1b0 0     Ukn (Threadpool Completion Port) 
XXXX   55    0 29cbe540   8039820 Preemptive  00000000:00000000 004cf1b0 0     Ukn (Threadpool Completion Port) 
  69    6 1a30 2a1be4d8   a029220 Preemptive  00000000:00000000 004cf1b0 0     MTA (Threadpool Completion Port) 
 ...

When I try to walk the call stack of thread 68 with !clrstack, it fails, but I can still use kb to see its native stack trace.

0:068> !clrstack
OS Thread Id: 0x1bd4 (68)
Unable to walk the managed stack. The current thread is likely not a 
managed thread. You can run !threads to get a list of managed threads in
the process
Failed to start stack walk: 80070057
0:068> kb
ChildEBP RetAddr  Args to Child              
2c85f580 773c8df4 000019ac 00000000 00000000 ntdll!NtWaitForSingleObject+0x15
2c85f5e4 773c8cd8 00000000 00000000 2bc40458 ntdll!RtlpWaitOnCriticalSection+0x13e
2c85f60c 773f5f33 17700138 52a74c34 175f3f40 ntdll!RtlEnterCriticalSection+0x150
2c85f654 773c740d 00000000 17700000 2bc404b8 ntdll!RtlpFreeUserBlock+0x47
2c85f690 773be023 2bc404b8 3b6af630 27620000 ntdll!RtlpLowFragHeapFree+0x2cc
2c85f6a8 75a814ad 17700000 00000000 2bc404b8 ntdll!RtlFreeHeap+0x105
WARNING: Stack unwind information not available. Following frames may be wrong.

I can see that thread 68 is itself blocked on another critical section, 17700138, which is owned by another thread, 1620. That should be the reason the UI thread hung.

But my question is: what is the real state of thread 68? If it is dead, how can it own the critical section ntdll!LdrpLoaderLock+0 at 774920c0 and still have a call stack?

Steven
  • Dead (XXXX) means dead from a .NET perspective. This means that there is no .NET thread that corresponds to the given OS thread. That's all it means. In your !threads output, we can't see thread 1bd4, likely because it is not managed at all. Clearly, thread 1bd4 still exists, as you were able to walk the stack in the debugger. Your UI thread is trying to load a module, but cannot do so because of what looks like an unmanaged deadlock. You'll have to walk the chain, looking at the stacks until you find the two locks that are causing the deadlock. !sosex.dlk may help you here. – Steve Johnson Jun 03 '15 at 00:16
  • Although in this case the thread does still exist (but is unmanaged) note that there is nothing preventing a thread from exiting while it owns a critical section, and the critical section is *not* implicitly exited. (You can use a mutex instead if you want Windows to notice that the owning thread no longer exists.) – Harry Johnston Jun 03 '15 at 01:58
  • @Steve I found another post explaining the ["dead thread"](http://blogs.msdn.com/b/yunjin/archive/2005/08/29/457150.aspx). It says a dead thread refers to a C++ thread which no longer has an active OS thread. Anyway, whether the thread is "really" dead or its OS thread is still alive, it does own the critical section. – Steven Jun 03 '15 at 02:54
  • @Harry does that mean that if the owner thread exits normally or is terminated without explicitly leaving the critical section, any other thread waiting on that critical section will be blocked forever? – Steven Jun 03 '15 at 02:59
  • That's the most likely outcome in practice. In principle, exiting a thread that holds a critical section is illegal, so anything might happen; random data corruption, an outright crash, nasal flying monkeys ... – Harry Johnston Jun 03 '15 at 03:05
  • (In this particular case, since the critical section is `ntdll!LdrpLoaderLock+0` I imagine this is the loader lock. It probably means a DLL is doing something illegal in its DllMain function.) – Harry Johnston Jun 03 '15 at 03:10
  • I agree with @HarryJohnston. Is thread 68 calling free() (or some other memory deallocating function) in DllMain()? – Marc Sherman Jun 04 '15 at 14:41
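
The "walk the chain" advice in the comments can be sketched in a few lines of Python. This is only an illustration, not a debugger feature: the function name `walk_wait_chain` and both dictionaries are hypothetical, and the values are transcribed by hand from the `!locks`, `!threads`, and `kb` output in the question (the UI thread is assumed to be debugger thread 10, OS ID 1520). What thread 1620 is waiting on is not shown in the question, so the chain simply stops there.

```python
def walk_wait_chain(start, waits_on, owned_by):
    """Follow 'thread waits on lock owned by thread' links from `start`.

    Returns (chain, is_cycle), where chain is a list of
    (waiting_thread, lock_address, owning_thread) tuples.
    """
    chain = []
    seen = {start}
    thread = start
    while thread in waits_on:
        lock = waits_on[thread]
        owner = owned_by.get(lock)  # None if the owner is unknown
        chain.append((thread, lock, owner))
        if owner is None:
            return chain, False
        if owner in seen:
            # We came back to a thread already on the chain: a deadlock cycle.
            return chain, True
        seen.add(owner)
        thread = owner
    # The last owner is not known to be waiting on anything.
    return chain, False


# Lock address -> owning OS thread ID, copied from the !locks output.
owned_by = {
    0x774920C0: 0x1BD4,  # ntdll!LdrpLoaderLock
    0x1027C5BC: 0x1A4C,
    0x17700138: 0x1620,
}

# OS thread ID -> lock it is blocked on, read from the two kb stacks.
# Assumption: debugger thread 10 (OS ID 1520) is the UI thread.
waits_on = {
    0x1520: 0x774920C0,
    0x1BD4: 0x17700138,
}

chain, is_cycle = walk_wait_chain(0x1520, waits_on, owned_by)
for waiter, lock, owner in chain:
    print(f"thread {waiter:x} waits on {lock:08x} owned by {owner:x}")
print("cycle found" if is_cycle else "chain ends; inspect the last owner's stack")
```

With only the data shown in the question, the chain ends at thread 1620, so the next step would be to inspect that thread's stack and add whatever lock it is waiting on to `waits_on`.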
