
While following some WinDbg tutorials, I noticed that the k command prints some call stacks, including mine, with this header:

Child-SP          RetAddr           Call Site

Other online resources, such as CodeProject, show the k command printing this header instead:

Child-EBP          RetAddr           Call Site

I am confused as to why my output differs from theirs and what the difference actually means.

Lewis Kelsey
Mohamed341
  • Child-SP is for x86-64 stacks. Child-EBP is for 32-bit x86 stacks. – Raymond Chen May 14 '20 at 21:12
  • Does SP here stand for stack pointer, versus EBP, which is the base pointer? If so, why do the two represent different locations in memory? – Mohamed341 May 14 '20 at 21:18
  • Each stack dump is tailored to the architecture. Not sure why you're asking why two different things represent different things. They represent different things because they are different. – Raymond Chen May 14 '20 at 21:19
  • Apologies, let me explain myself. My understanding is that ESP and EBP are universal constructs, so I am assuming that if you want to walk the stack, you use those values in exactly the same way on x86 and x64, unless SP here is not equivalent to ESP. I would appreciate it if you could help me understand how to use the Child-SP or Child-EBP address. – Mohamed341 May 14 '20 at 21:21
  • They are not universal concepts. x86-64 code rarely uses ebp for stack frames. ARM doesn't even have an ebp register at all! – Raymond Chen May 15 '20 at 03:05

1 Answer

It depends on which calling convention is in use. Some functions, such as 32-bit cdecl functions, address their locals and the parameters they pass to callees relative to the base pointer (ebp), whereas the Windows x64 calling convention addresses them relative to rsp.

Child-SP is the value of the stack pointer in that frame. For every frame except frame 0, this is the address immediately above the return address pushed by that frame's call (the return address sits at Child-SP - 8). Frame 0 is the exception because the breakpoint might be hit before or during the prologue. If the breakpoint is on the first instruction, rsp is only 8 lower than the Child-SP of the previous frame; this rsp value is read from the trap frame. Child-SP is therefore the address of the frame with that frame number, provided you do not count the callee's return address as part of the frame (which makes sense here, because the RetAddr column treats the return address into the previous frame as part of this frame).

ChildEBP is the value of ebp while executing in that frame (the new value of ebp set up by the prologue, not the caller's ebp that is pushed onto the frame).
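To illustrate how a ChildEBP column can be produced, here is a minimal sketch of a frame-pointer walk. It assumes the conventional 32-bit layout in which [ebp] holds the caller's saved ebp and [ebp+4] holds the return address. The `memory` dictionary and every address in it are hypothetical values made up for the illustration, not taken from a real dump, and this is not how the debugger is actually implemented.

```python
# Hypothetical 32-bit stack contents: address -> 4-byte value.
# Assumed layout: [ebp] = caller's saved ebp, [ebp+4] = return address.
memory = {
    0x0012FF40: 0x0012FF60,  # frame 0: saved ebp of its caller
    0x0012FF44: 0x00401234,  # frame 0: return address
    0x0012FF60: 0x0012FF80,  # frame 1: saved ebp of its caller
    0x0012FF64: 0x00401500,  # frame 1: return address
    0x0012FF80: 0x00000000,  # frame 2: a saved ebp of 0 ends the chain
    0x0012FF84: 0x7C816D4F,  # frame 2: return address
}

def walk_ebp_chain(ebp, memory):
    """Follow saved-ebp links, returning (ChildEBP, RetAddr) pairs."""
    frames = []
    while ebp != 0 and ebp in memory:
        ret_addr = memory[ebp + 4]   # return address sits just above saved ebp
        frames.append((ebp, ret_addr))
        ebp = memory[ebp]            # move to the caller's frame
    return frames

for child_ebp, ret_addr in walk_ebp_chain(0x0012FF40, memory):
    print(f"{child_ebp:08x} {ret_addr:08x}")
```

An rsp-based x64 frame has no such chain to follow, which is why the x64 listing shows Child-SP instead.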

RetAddr is the return address that belongs to the frame, that is, the address the frame's function will return to.

Call Site is the address of the instruction after the call instruction that transferred control out of the frame (in other words, the callee's return address: the address of the call instruction plus its length). For frame 0 (the frame at the top of the stack) it is instead the address of the breakpoint or other exception that caused the break into the debugger (note: the address of the exception itself, not of the instruction after it, so this is the rip in the trap frame). Call Site shows the name of the function that owns the frame on this row, as long as the function doesn't contain a label, and Args to Child are the arguments passed to that function. In effect, the call site is the return address of the frame that it calls (the frame above it).

Args to Child are the arguments passed to the function that owns the stack frame on the current row, not the arguments it passes to the function it calls in the space allocated by the prologue. This is almost never accurate on x64, where it shows the first 3 quadwords of the home space: the first 4 arguments can be passed in registers, and the callee may not home them (save them in the home space) unless it is compiled without optimization (-O0) or is a varargs function, or it may put something else in the home space entirely. 'Child' in 'Args to Child' and 'Child-SP' is a misleading name that suggests the function the frame calls, but it actually refers to the current function.
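As a rough sketch of where the home space sits, the snippet below lists the slots relative to rsp at function entry under the Windows x64 convention: the return address at [rsp], followed by four 8-byte home slots for rcx, rdx, r8 and r9. The helper name `home_space_slots` is made up for this illustration; the entry rsp is the frame 0 Child-SP from the notepad example in this answer.

```python
def home_space_slots(rsp_at_entry):
    """Return the addresses of the return address and the four home slots
    as seen at function entry (before the prologue moves rsp)."""
    slots = {"return address": rsp_at_entry}
    for i, reg in enumerate(("rcx", "rdx", "r8", "r9"), start=1):
        slots[f"home slot for {reg}"] = rsp_at_entry + 8 * i
    return slots

# Child-SP of frame 0 in the notepad example (breakpoint on first instruction).
entry_rsp = 0x0000007374D2F308
for name, addr in home_space_slots(entry_rsp).items():
    print(f"{addr:016x}  {name}")
```

In that example the first instruction, mov qword ptr [rsp+8],rbx, writes rbx to 00000073`74d2f310, the slot nominally reserved for rcx, which is one instance of the home space holding something other than the arguments.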

There is an example stack trace on this site with a breakpoint on notepad!ShowOpenSaveDialog, which places the break on the function's first instruction:

0:011> bp notepad!ShowOpenSaveDialog
0:011> g
Breakpoint 0 hit
notepad!ShowOpenSaveDialog:
00007ff7`0307182c 48895c2408      mov     qword ptr [rsp+8],rbx ss:00000073`74d2f310=0000000000000000
0:000> k
 # Child-SP          RetAddr           Call Site
00 00000073`74d2f308 00007ff7`03071aeb notepad!ShowOpenSaveDialog
01 00000073`74d2f310 00007ff7`030721fa notepad!InvokeOpenDialog+0x14f
02 00000073`74d2f370 00007ff7`030738d6 notepad!NPCommand+0x4a2
03 00000073`74d2f6f0 00007fff`664b6d41 notepad!NPWndProc+0x726
04 00000073`74d2f9f0 00007fff`664b6713 USER32!UserCallWinProcCheckWow+0x2c1
05 00000073`74d2fb80 00007ff7`03073bdb USER32!DispatchMessageWorker+0x1c3
06 00000073`74d2fc10 00007ff7`03089333 notepad!WinMain+0x27f
07 00000073`74d2fd10 00007fff`68ea3034 notepad!__mainCRTStartup+0x19f
08 00000073`74d2fdd0 00007fff`69073691 KERNEL32!BaseThreadInitThunk+0x14
09 00000073`74d2fe00 00000000`00000000 ntdll!RtlUserThreadStart+0x21
0:000> r @rcx=0
0:000> r
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=00000213f5020968 rsi=000000499cb1f338 rdi=00000213f5021c20
rip=00007ff70307182c rsp=000000499cb1f288 rbp=000000499cb1f2d0
 r8=00000213f4ffe9d6  r9=000000499cb1f338 r10=00000ffee060e246
r11=0000000000014140 r12=0000000000000000 r13=0000000000000001
r14=000000000004094e r15=00000000ffffffff
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
notepad!ShowOpenSaveDialog:
00007ff7`0307182c 48895c2408      mov     qword ptr [rsp+8],rbx ss:00000049`9cb1f290=0000000000000000
0:000> ub notepad!InvokeOpenDialog+0x14f
notepad!InvokeOpenDialog+0x133:
00007ff7`03071acf 8bd8            mov     ebx,eax
00007ff7`03071ad1 85c0            test    eax,eax
00007ff7`03071ad3 782c            js      notepad!InvokeOpenDialog+0x165 (00007ff7`03071b01)
00007ff7`03071ad5 4c8b05fc080200  mov     r8,qword ptr [notepad!szOpenCaption (00007ff7`030923d8)]
00007ff7`03071adc 4c8bce          mov     r9,rsi
00007ff7`03071adf 488b5538        mov     rdx,qword ptr [rbp+38h]
00007ff7`03071ae3 498bce          mov     rcx,r14
00007ff7`03071ae6 e841fdffff      call    notepad!ShowOpenSaveDialog (00007ff7`0307182c)
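The relationships described above can be checked against this transcript with a little arithmetic. The sketch below uses only values copied from the output; the helper `addr` is a made-up name for parsing WinDbg's backtick-separated addresses.

```python
def addr(text):
    """Parse a WinDbg 64-bit address such as 00007ff7`03071aeb."""
    return int(text.replace("`", ""), 16)

# The call instruction shown by ub: address and encoded bytes.
call_addr = addr("00007ff7`03071ae6")
call_bytes = bytes.fromhex("e841fdffff")   # e8 = call rel32

# RetAddr of frame 0 = address of the call + length of the call.
ret_addr = call_addr + len(call_bytes)

# The rel32 operand is relative to the next instruction (the return address).
rel32 = int.from_bytes(call_bytes[1:], "little", signed=True)
target = ret_addr + rel32

# Frame 0 was stopped on its first instruction, so its Child-SP is only
# the pushed return address (8 bytes) below frame 1's Child-SP.
sp_gap = addr("00000073`74d2f310") - addr("00000073`74d2f308")

print(hex(ret_addr), hex(target), sp_gap)
```

The computed return address matches the RetAddr shown for frame 0 and the call site notepad!InvokeOpenDialog+0x14f shown for frame 1, and the computed target is the address of notepad!ShowOpenSaveDialog.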

The trap frame created by nt!KiBreakpointTrap is not part of the stack trace, because trap frames are pushed onto the thread's kernel stack, not the user stack.

If you are kernel debugging, you will see the user and kernel stacks fused together, provided there is a trap frame and the process address space is accessible:

lkd> .process /P fffffa80723296f0
lkd> .reload
lkd> ld *
lkd> !process 3490                                                                                                                             
Searching for Process with Cid == 3490                                                                                                         
Cid handle table at fffff8a00195f000 with 4126 entries in use                                                                                  
                                                                                                                                               
PROCESS fffffa80723296f0                                                                                                                       
    SessionId: 1  Cid: 3490    Peb: 7fffffdf000  ParentCid: 1470                                                                               
    DirBase: 403874000  ObjectTable: fffff8a02b3a2ce0  HandleCount: 293.                                                                       
    Image: chrome.exe                                                                                                                          
    VadRoot fffffa805fc816e0 Vads 295 Clone 0 Private 16586. Modified 1536. Locked 0.                                                          
    DeviceMap fffff8a00208d0b0                                                                                                                 
    Token                             fffff8a03466d9e0                                                                                         
    ElapsedTime                       00:23:43.376                                                                                             
    UserTime                          00:00:00.717                                                                                             
    KernelTime                        00:00:00.000                                                                                             
    QuotaPoolUsage[PagedPool]         0                                                                                                        
    QuotaPoolUsage[NonPagedPool]      0                                                                                                        
    Working Set Sizes (now,min,max)  (27282, 50, 345) (109128KB, 200KB, 1380KB)                                                                
    PeakWorkingSetSize                33917                                                                                                    
    VirtualSize                       870 Mb                                                                                                   
    PeakVirtualSize                   891 Mb                                                                                                   
    PageFaultCount                    81452                                                                                                    
    MemoryPriority                    BACKGROUND                                                                                               
    BasePriority                      4                                                                                                        
    CommitCharge                      19316                                                                                                    
    Job                               fffffa805f9f46d0                                                                                         
        THREAD fffffa802e890b50  Cid 3490.469c  Teb: 000007fffffdd000 Win32Thread: fffff900c53a48c0 WAIT: (UserRequest) UserMode Non-Alertable 
            fffffa8062daf060  SynchronizationEvent                                                                                             
        Not impersonating                                                                                                                      
        DeviceMap                 fffff8a00208d0b0                                                                                             
        Owning Process            fffffa80723296f0       Image:         chrome.exe                                                             
        Attached Process          N/A            Image:         N/A                                                                            
        Wait Start TickCount      45844297       Ticks: 42 (0:00:00:00.655)                                                                    
        Context Switch Count      21234                 LargeStack                                                                             
        UserTime                  00:00:07.300                                                                                                 
        KernelTime                00:00:00.234                                                                                                 
        Win32 Start Address chrome!IsSandboxedProcess (0x000000013fbcbe90)                                                                     
        Stack Init fffff8802d5bec70 Current fffff8802d5be7c0                                                                                   
        Base fffff8802d5bf000 Limit fffff8802d5b7000 Call 0                                                                                    
        Priority 4 BasePriority 4 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5 
        Child-SP          RetAddr           Call Site                                                                                          
        fffff880`2d5be800 fffff800`0367ec32 nt!KiSwapContext+0x7a                                                                              
        fffff880`2d5be940 fffff800`0368145f nt!KiCommitThreadWait+0x1d2                                                                        
        fffff880`2d5be9d0 fffff800`0397602e nt!KeWaitForSingleObject+0x19f                                                                     
        fffff880`2d5bea70 fffff800`03678c13 nt!NtWaitForSingleObject+0xde                                                                      
        fffff880`2d5beae0 00000000`7782bd7a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`2d5beae0)                                     
        00000000`002eec08 000007fe`fd6110ac ntdll!NtWaitForSingleObject+0xa                                                                    
        00000000`002eec10 000007fe`d97c2107 KERNELBASE!WaitForSingleObjectEx+0x79                                                              
        00000000`002eecb0 000007fe`d858aeab chrome_child!GetHandleVerifier+0x15d8417                                                           
        00000000`002eed40 000007fe`d823004d chrome_child!GetHandleVerifier+0x3a11bb                                                            
        00000000`002eedb0 000007fe`d822fc91 chrome_child!GetHandleVerifier+0x4635d                                                             
        00000000`002eee10 000007fe`d8217c59 chrome_child!GetHandleVerifier+0x45fa1                                                             
        00000000`002eee40 000007fe`d82176de chrome_child!GetHandleVerifier+0x2df69                                                             
        00000000`002ef040 000007fe`d82105d1 chrome_child!GetHandleVerifier+0x2d9ee                                                             
        00000000`002ef1b0 000007fe`d81e518b chrome_child!GetHandleVerifier+0x268e1                                                             
        00000000`002ef250 000007fe`d81e4c58 chrome_child!ChromeMain+0x377d                                                                     
        00000000`002ef570 000007fe`d81e1b2e chrome_child!ChromeMain+0x324a                                                                     
        00000000`002ef600 00000001`3faf354c chrome_child!ChromeMain+0x120                                                                      
        00000000`002ef6d0 00000001`3faf1699 chrome+0x354c                                                                                      
        00000000`002ef7c0 00000001`3fbcbe33 chrome+0x1699                                                                                      
        00000000`002efba0 00000000`775d59cd chrome!IsSandboxedProcess+0x61483                                                                  
        00000000`002efbe0 00000000`7780a561 kernel32!BaseThreadInitThunk+0xd                                                                   
        00000000`002efc10 00000000`00000000 ntdll!RtlUserThreadStart+0x1d                                                                      

The trap frame is built by nt!KiSystemCall64, which contains a label called nt!KiSystemServiceCopyEnd; the code at that label makes the call to the system service, in this case nt!NtWaitForSingleObject. nt!KiSystemCall64 is the first function on the kernel stack, so its frame is the first stack frame there. The trap frame begins at the rsp (Child-SP) of nt!KiSystemCall64 and starts with the home space that it allocates for the next function. The first instruction of nt!KiSystemCall64 is swapgs, which swaps gs with the value in IA32_KERNEL_GS_BASE. It then stores rsp into gs:10h, which is the UserRsp field in the KPCR, and loads rsp from gs:1A8h, which is the RspBase field in the KPRCB. This swaps the thread's user rsp for its kernel rsp, which is cached in the KPRCB of the logical processor (hyperthread) currently running the thread. It then starts building the trap frame. At this point rsp points to offset 0x190 in the trap frame by definition (the byte after the end of the frame), and construction starts with pushing SS, because the CPU doesn't do this when you use syscall instead of int 0x2e.
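A quick sanity check of that layout against the !process output above: adding the 0x190-byte trap frame size to the TrapFrame address printed in the stack trace should land on the initial kernel stack pointer. Treating the "Stack Init" value as the kernel rsp that gets loaded on entry is an assumption made for this sketch; the two addresses are copied from the dump.

```python
KTRAP_FRAME_SIZE = 0x190              # trap frame size quoted in the answer

trap_frame_start = 0xFFFFF8802D5BEAE0  # "TrapFrame @ fffff880`2d5beae0"
stack_init = 0xFFFFF8802D5BEC70        # "Stack Init fffff8802d5bec70"

# The frame is built downwards from the initial kernel rsp, so its end
# (start + size) should coincide with that initial rsp.
trap_frame_end = trap_frame_start + KTRAP_FRAME_SIZE

print(f"{trap_frame_end:016x}")
print(trap_frame_end == stack_init)
```

The match is consistent with the trap frame being the very first thing on the kernel stack for this system call.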

When a base pointer is used, the trap frame would be at 8 less than the ebp of the frame above it (subtracting the saved ebp and the return address, 4 bytes each), which here is the frame of nt!NtWaitForSingleObject.

Exceptions also have an exception frame, which nt!KiExceptionDispatch uses to store the non-volatile registers after a handler such as nt!KiBreakpointTrap has created the trap frame. (The trap frame is built differently from the system call case: the processor pushes SS:RSP, RFLAGS, CS:RIP and ErrorCode onto the stack for a user-mode exception, and RFLAGS, CS:RIP and ErrorCode for a kernel-mode exception, and no rsp is saved to or loaded from the KPCR / KPRCB.) nt!KiExceptionDispatch creates a KEXCEPTION_FRAME beginning at the next rsp in the stack trace, and the EXCEPTION_RECORD sits immediately above it on the stack. Their combined size is 0x1D8, so you'll see a difference of 0x1E0 between the stack pointers of nt!KiExceptionDispatch and nt!KiBreakpointTrap once you include the return address to nt!KiBreakpointTrap.
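The size arithmetic can be spelled out as follows. The individual structure sizes used here (0x140 for KEXCEPTION_FRAME and 0x98 for the 64-bit EXCEPTION_RECORD) are assumptions not stated in the text above; only the totals 1D8 and 1E0 come from it.

```python
KEXCEPTION_FRAME_SIZE = 0x140   # assumed size of KEXCEPTION_FRAME on x64
EXCEPTION_RECORD_SIZE = 0x98    # assumed size of the 64-bit EXCEPTION_RECORD
RETURN_ADDRESS_SIZE = 8         # return address into nt!KiBreakpointTrap

# Exception frame plus exception record, as laid out on the stack.
combined = KEXCEPTION_FRAME_SIZE + EXCEPTION_RECORD_SIZE

# Expected gap between the Child-SP values of the two functions.
child_sp_gap = combined + RETURN_ADDRESS_SIZE

print(hex(combined), hex(child_sp_gap))
```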

The trap frame stores all the registers that the kernel might clobber during the chain of functions it calls, known as volatile registers, because the volatile registers are guaranteed to be unchanged when the syscall or exception returns. It does not save non-volatile registers, because the kernel functions that are called save any non-volatile registers they use, so by the time control returns to this function to perform a sysret, all the non-volatile registers hold the values they had before. Exceptions need an exception frame because the function that performs the stack unwinding needs to know what the non-volatile registers contained at the time of the exception. A CONTEXT structure is created on the stack in nt!KiDispatchException to represent the full context at the time of the exception, combining the non-volatile and volatile register state.

Another exception frame is needed when swapping thread contexts, to save the state of the non-volatile registers at the time the thread is switched out (that is, their state in nt!KiSwapContext) so that it can be restored when the thread is switched back in. The volatile state does not need to be saved, because the call to SwapContext is assumed to discard all volatile registers. The trap frame created on entry to kernel mode is used to restore the volatile register context when the thread returns to user mode.

Fun fact: for a breakpoint exception (INT 3), the CPU pushes the rip of the instruction after the breakpoint instruction rather than that of the breakpoint instruction itself. Windows therefore has to decrement the address, so the call site at the top of the stack is the rip in the trap frame and not the return address pushed by the CPU.
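As a small illustration of that adjustment, assuming the single-byte 0xCC encoding of INT 3: the function name below is hypothetical, and the pushed rip is derived from the breakpoint address in the notepad example rather than read from a real trap frame.

```python
INT3_LENGTH = 1  # assumes the one-byte 0xCC encoding of INT 3

def reported_breakpoint_address(pushed_rip):
    """Back the CPU-pushed rip up to the breakpoint instruction itself."""
    return pushed_rip - INT3_LENGTH

breakpoint_addr = 0x00007FF70307182C            # notepad!ShowOpenSaveDialog
pushed_rip = breakpoint_addr + INT3_LENGTH      # what the CPU would push
print(hex(reported_breakpoint_address(pushed_rip)))
```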

Lewis Kelsey