It depends on what calling convention is being used. Some functions allocate locals and parameters that it will pass to the functions it calls using the base pointer, such as cdecl, but Windows x64 calling convention uses rsp
.
Child-SP is the value the stack pointer of that frame, which will be the byte before the return address for all frames except for frame 0, which might have a breakpoint before or during the prologue. If the breakpoint is on the first instruction, then the rsp
would have only decreased by 8 on the Child-SP of the previous frame; this rsp
value is read from the trap frame. The Child-SP is the address of the frame at the frame number if you do not consider the return address of the callee to be part of the frame (which makes sense in this scenario because the return address column of the frame is showing the return address to the previous frame to be part of this frame).
ChildEBP is the value of ebp
when in that frame (not the ebp
that is pushed to the frame, but the new value of ebp
)
RetAddr is the return address that belongs to the frame, so it is the address it will return to
The call site is the address of the instruction after the call instruction that was called that ended the frame (so basically the callee return address -- call instruction + call instruction length
), or the address of the breakpoint or other exception that caused it to break into the debugger (note: the address of the exception, not the instruction after it, so this will be rip
in the trap frame) in the case of frame 0 (the frame at the top of the stack). This will indicate the name of the function that owns the frame on this row as long as the function doesn't contain a label, and args to child are the arguments passed to it. Indeed the callsite is the return address of the frame that it calls (the frame above it).
Args to child are the arguments passed to the function that owns the stack frame on the current row, not the arguments passed to the callee function it calls in the allocated space by the prologue. This is almost never accurate on x64, where it shows the first 3 quadwords of the homespace, because the first 4 arguments can be passed in registers, and the callee function may not home these arguments (save them in the homespace) if it is -O0
or not a varargs function, or may put something else in the homespace entirely. 'Child' in 'Args to child' and 'Child-SP' is a false and misleading name which implies the function the frame calls, but it actually refers to the current function.
There is an example stacktrace on this site which shows a breakpoint on notepad!ShowOpenSaveDialog
, which will be the first instruction.
0:011> bp notepad!ShowOpenSaveDialog
0:011> g
Breakpoint 0 hit
notepad!ShowOpenSaveDialog:
00007ff7`0307182c 48895c2408 mov qword ptr [rsp+8],
rbx ss:00000073`74d2f310=0000000000000000
0:000> k
# Child-SP RetAddr Call Site
00 00000073`74d2f308 00007ff7`03071aeb notepad!ShowOpenSaveDialog
01 00000073`74d2f310 00007ff7`030721fa notepad!InvokeOpenDialog+0x14f
02 00000073`74d2f370 00007ff7`030738d6 notepad!NPCommand+0x4a2
03 00000073`74d2f6f0 00007fff`664b6d41 notepad!NPWndProc+0x726
04 00000073`74d2f9f0 00007fff`664b6713 USER32!UserCallWinProcCheckWow+0x2c1
05 00000073`74d2fb80 00007ff7`03073bdb USER32!DispatchMessageWorker+0x1c3
06 00000073`74d2fc10 00007ff7`03089333 notepad!WinMain+0x27f
07 00000073`74d2fd10 00007fff`68ea3034 notepad!__mainCRTStartup+0x19f
08 00000073`74d2fdd0 00007fff`69073691 KERNEL32!BaseThreadInitThunk+0x14
09 00000073`74d2fe00 00000000`00000000 ntdll!RtlUserThreadStart+0x21
0:000> r @rcx=0
0:000> r
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=00000213f5020968 rsi=000000499cb1f338 rdi=00000213f5021c20
rip=00007ff70307182c rsp=000000499cb1f288 rbp=000000499cb1f2d0
r8=00000213f4ffe9d6 r9=000000499cb1f338 r10=00000ffee060e246
r11=0000000000014140 r12=0000000000000000 r13=0000000000000001
r14=000000000004094e r15=00000000ffffffff
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
notepad!ShowOpenSaveDialog:
00007ff7`0307182c 48895c2408 mov qword ptr [rsp+8],
rbx ss:00000049`9cb1f290=0000000000000000
0:000> ub notepad!InvokeOpenDialog+0x14f
notepad!InvokeOpenDialog+0x133:
00007ff7`03071acf 8bd8 mov ebx,eax
00007ff7`03071ad1 85c0 test eax,eax
00007ff7`03071ad3 782c js notepad!InvokeOpenDialog+0x165 (00007ff7`03071b01)
00007ff7`03071ad5 4c8b05fc080200 mov r8,qword ptr [notepad!szOpenCaption (00007ff7`030923d8)]
00007ff7`03071adc 4c8bce mov r9,rsi
00007ff7`03071adf 488b5538 mov rdx,qword ptr [rbp+38h]
00007ff7`03071ae3 498bce mov rcx,r14
00007ff7`03071ae6 e841fdffff call notepad!ShowOpenSaveDialog (00007ff7`0307182c)
The trap frame created by nt!KiBreakpointTrap
is not part of the stack trace, because trap frames are pushed to the threads's kernel stack not the user stack.
If you are kernel debugging then you will see a fusion of the user and kernel stacks if there is a trap frame and the process address space is accessible:
lkd> .process /P fffffa80723296f0
lkd> .reload
lkd> ld *
lkd> !process 3490
Searching for Process with Cid == 3490
Cid handle table at fffff8a00195f000 with 4126 entries in use
PROCESS fffffa80723296f0
SessionId: 1 Cid: 3490 Peb: 7fffffdf000 ParentCid: 1470
DirBase: 403874000 ObjectTable: fffff8a02b3a2ce0 HandleCount: 293.
Image: chrome.exe
VadRoot fffffa805fc816e0 Vads 295 Clone 0 Private 16586. Modified 1536. Locked 0.
DeviceMap fffff8a00208d0b0
Token fffff8a03466d9e0
ElapsedTime 00:23:43.376
UserTime 00:00:00.717
KernelTime 00:00:00.000
QuotaPoolUsage[PagedPool] 0
QuotaPoolUsage[NonPagedPool] 0
Working Set Sizes (now,min,max) (27282, 50, 345) (109128KB, 200KB, 1380KB)
PeakWorkingSetSize 33917
VirtualSize 870 Mb
PeakVirtualSize 891 Mb
PageFaultCount 81452
MemoryPriority BACKGROUND
BasePriority 4
CommitCharge 19316
Job fffffa805f9f46d0
THREAD fffffa802e890b50 Cid 3490.469c Teb: 000007fffffdd000 Win32Thread: fffff900c53a48c0 WAIT: (UserRequest) UserMode Non-Alertable
fffffa8062daf060 SynchronizationEvent
Not impersonating
DeviceMap fffff8a00208d0b0
Owning Process fffffa80723296f0 Image: chrome.exe
Attached Process N/A Image: N/A
Wait Start TickCount 45844297 Ticks: 42 (0:00:00:00.655)
Context Switch Count 21234 LargeStack
UserTime 00:00:07.300
KernelTime 00:00:00.234
Win32 Start Address chrome!IsSandboxedProcess (0x000000013fbcbe90)
Stack Init fffff8802d5bec70 Current fffff8802d5be7c0
Base fffff8802d5bf000 Limit fffff8802d5b7000 Call 0
Priority 4 BasePriority 4 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
fffff880`2d5be800 fffff800`0367ec32 nt!KiSwapContext+0x7a
fffff880`2d5be940 fffff800`0368145f nt!KiCommitThreadWait+0x1d2
fffff880`2d5be9d0 fffff800`0397602e nt!KeWaitForSingleObject+0x19f
fffff880`2d5bea70 fffff800`03678c13 nt!NtWaitForSingleObject+0xde
fffff880`2d5beae0 00000000`7782bd7a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`2d5beae0)
00000000`002eec08 000007fe`fd6110ac ntdll!NtWaitForSingleObject+0xa
00000000`002eec10 000007fe`d97c2107 KERNELBASE!WaitForSingleObjectEx+0x79
00000000`002eecb0 000007fe`d858aeab chrome_child!GetHandleVerifier+0x15d8417
00000000`002eed40 000007fe`d823004d chrome_child!GetHandleVerifier+0x3a11bb
00000000`002eedb0 000007fe`d822fc91 chrome_child!GetHandleVerifier+0x4635d
00000000`002eee10 000007fe`d8217c59 chrome_child!GetHandleVerifier+0x45fa1
00000000`002eee40 000007fe`d82176de chrome_child!GetHandleVerifier+0x2df69
00000000`002ef040 000007fe`d82105d1 chrome_child!GetHandleVerifier+0x2d9ee
00000000`002ef1b0 000007fe`d81e518b chrome_child!GetHandleVerifier+0x268e1
00000000`002ef250 000007fe`d81e4c58 chrome_child!ChromeMain+0x377d
00000000`002ef570 000007fe`d81e1b2e chrome_child!ChromeMain+0x324a
00000000`002ef600 00000001`3faf354c chrome_child!ChromeMain+0x120
00000000`002ef6d0 00000001`3faf1699 chrome+0x354c
00000000`002ef7c0 00000001`3fbcbe33 chrome+0x1699
00000000`002efba0 00000000`775d59cd chrome!IsSandboxedProcess+0x61483
00000000`002efbe0 00000000`7780a561 kernel32!BaseThreadInitThunk+0xd
00000000`002efc10 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
The trap frame is made by nt!KiSystemCall64
, which has a label in it called nt!KiSystemServiceCopyEnd
, which makes the call to the system service, in this case nt!NtWaitForSingleObject
. The trap frame is the first stack frame hence function on the kernel stack and begins at the rsp
(Child-SP) of the function nt!KiSystemCall64
and starts with the homespace that it allocates for the next function. The first instruction of nt!KiSystemCall64
is swapgs
to swap gs
with the gs
in IA32_KERNEL_GS_BASE
, and then it stores rsp
into gs:10h
, which is the UserRsp
field in the KPCR and then stores gs:1A8h
into rsp
, which is the RspBase
field in the KPRCB, this is clearly swapping the user rsp
with the kernel rsp
for the thread, which is cached in the hyperthread that currently has that thread running's KPRCB. It then starts building the trap frame, where the rsp
is currently pointing to offset 0x190
in the trap frame by definition (the byte after the end of the frame), starting with pushing SS
, because the CPU doesn't do this when you use syscall
instead of int 0x2e
.
When a base pointer is used, the trap frame would be at 8 less than the ebp
of the frame above it (subtract ebp
and return address, 4 bytes each), so the frame of nt!NtWaitForSingleObject
Exceptions have an exception frame that stores the non-volatile registers using nt!KiExceptionDispatch
, after nt!KiBreakpointTrap
for instance creates the trap frame (and the trap frame building different to a system call because the processor pushes SS:RSP
RFLAGS
, CS:RIP
and ErrorCode
to the stack for a user-mode exception and RFLAGS
, CS:RIP
and ErrorCode
to the stack for a kernel-mode exception, and does not save/load any rsp
from the KPCR / KPRCB). nt!KiExceptionDispatch
creates an exception frame KEXCEPTION_FRAME
beginning at the next rsp
in the stack trace and the EXCEPTION_RECORD
is immediately after above that on the stack, with a total combined size of 1D8
, so you'll see a difference of 1E0
between the stack pointers of nt!KiExceptionDispatch
and nt!KiBreakpointTrap
when you include the return address to nt!KiBreakpointTrap
.
The trap frame stores all registers that can be clobbered by the kernel
during the array of functions that are called in the kernel, called volatile registers, because the volatile registers are guaranteed not to be changed during a syscall or an exception. It does not save non-volatile registers because the functions that are called in the kernel save any non-volatile registers they use, so by the time it returns to this function to perform a sysret
, all the nonvolatile registers will be the value they were before. You need an exception frame for exceptions because it needs to know in the function that performs the stack unwinding what the nonvolatile registers contained at the time of the exception so that the stack can be unwound. A CONTEXT
structure is created on the stack in nt!KiDispatchException
to represent the full context at the time of the exception, which is a combination of the nonvolatile and volatile register state.
You also need another exception frame when swapping thread contexts in order to save the state of the nonvolatile registers at the time the thread is switched out, so this will be the state in nt!KiSwapContext
, so this can be restored when the thread is switched back in. The volatile state does not need to be saved, because it's assumed that the call to SwapContext
discards all volatile registers. The trap frame created when entering kernel mode in the first place will be used to restore the volatile register context to the thread when it returns to user mode.
Fun fact: for a breakpoint exception INT 3
, the CPU pushes the rip
starting after the end of the breakpoint instruction and not the breakpoint instruction which means that windows needs to decrement the address so the top of the stack callsite is the rip
in the trap frame and not the return address pushed by the CPU.