
My managed process is suspected of having caused a BSOD at a client site. I received a full memory dump (i.e., including kernel, physical pages only), but I am still unable to inspect my process's stacks.

After switching to my process context:

.process /p /r <MyProcAddress>

I see only:

1: kd> k
 # ChildEBP RetAddr  
00 b56e3b70 81f2aa5d nt!KeBugCheckEx+0x1e
01 b56e3b94 81e7b68d nt!PspCatchCriticalBreak+0x71
02 b56e3bc4 81e6dfd1 nt!PspTerminateAllThreads+0x2d
03 b56e3bf8 8d48159a nt!NtTerminateProcess+0xcd
WARNING: Stack unwind information not available. Following frames may be wrong.
04 b56e3c24 81c845e4 klif+0x7559a
05 b56e3c24 77da6bb4 nt!KiSystemServicePostCall
06 0262f34c 00220065 ntdll!KiFastSystemCallRet
07 0262f390 003e0022 0x220065
08 0262f394 0073003c 0x3e0022
09 0262f398 00730079 0x73003c
0a 0262f39c 006e003a 0x730079
0b 0262f3a0 006d0061 0x6e003a
0c 0262f3a4 00200065 0x6d0061
0d 0262f3a8 00610076 0x200065
0e 0262f3ac 0075006c 0x610076
0f 0262f3b0 003d0065 0x75006c
10 0262f3b4 00770022 0x3d0065
11 0262f3b8 006e0069 0x770022
12 0262f3bc 006f0077 0x6e0069
13 0262f3c0 00640072 0x6f0077
14 0262f3c4 00200022 0x640072

...

This is expected for a managed process, and the SOS extension does not work on kernel dumps.

Is there anything I can do to view the throwing managed stack? It was previously said to be 'much more difficult', but hopefully not impossible.


PS. I'm aware of the Kaspersky driver klif.sys in the stack, and it is my prime suspect. But the question is more general: hopefully there is a way to understand what my process was doing at the time.

Ofek Shilon
  • The stack seems to be overwritten: 0x22 0x00 0x65 0x00 etc. do not look like addresses but like a Unicode string (0x65 corresponds to e, 0x22 to a double quote). Dump the memory, e.g. with db esp, and see if you can find clues. – blabb Aug 12 '18 at 15:35
  • Maybe check the cause of the BSOD first, then check whether your client process may have triggered it. – Thomas Weller Aug 12 '18 at 16:24
  • @ThomasWeller I'm investigating my client process to try and understand the BSOD, of course. – Ofek Shilon Aug 13 '18 at 08:49
  • The causality does not seem right. Why are you so sure that your process is involved in the BSOD? Applications normally cannot do that. They run in user mode. Only kernel mode stuff can cause a BSOD. – Thomas Weller Aug 13 '18 at 11:14
  • @ThomasWeller sadly, my application does need to fiddle with csrss hooks. – Ofek Shilon Aug 13 '18 at 15:49

1 Answer

The stack as you posted it is not correct.
It appears to be overwritten, or is the result of some other artefact.
With stack details like these you will have a hard time deciphering
anything useful at all.

The return-address column of frames 07 to 14, laid out as little-endian bytes and converted to printable characters, looks like this:

Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  22 00 3E 00 3C 00 73 00 79 00 73 00 3A 00 6E 00  ".>.<.s.y.s.:.n.
00000010  61 00 6D 00 65 00 20 00 76 00 61 00 6C 00 75 00  a.m.e. .v.a.l.u.
00000020  65 00 3D 00 22 00 77 00 69 00 6E 00 77 00 6F 00  e.=.".w.i.n.w.o.
00000030  72 00 64 00 22 00 20 00                          r.d.". .
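One way to check that these "frames" are really text is to pack the RetAddr values from the question's `k` output back into little-endian bytes and decode them as UTF-16. The following is only a minimal sketch: `RET_ADDRS` and `decode_dwords` are names made up for this illustration, and it assumes the values of frames 07 to 14 are consecutive dwords on the x86 user stack (the ChildEBP column advancing by 4 suggests this).

```python
import struct

# RetAddr column of frames 07..14, copied from the `k` output in the question.
# Assumption: these are consecutive dwords on a little-endian x86 stack.
RET_ADDRS = [
    0x003E0022, 0x0073003C, 0x00730079, 0x006E003A,
    0x006D0061, 0x00200065, 0x00610076, 0x0075006C,
    0x003D0065, 0x00770022, 0x006E0069, 0x006F0077,
    0x00640072, 0x00200022,
]


def decode_dwords(dwords):
    """Pack 32-bit values as little-endian bytes and decode them as UTF-16LE."""
    raw = struct.pack("<%dI" % len(dwords), *dwords)
    return raw.decode("utf-16-le")


if __name__ == "__main__":
    print(repr(decode_dwords(RET_ADDRS)))
```

The decoded text is `"><sys:name value="winword" `, which looks like a fragment of an XML string sitting in a stack buffer rather than a chain of return addresses.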

Try !analyze -v and see what the BSOD analysis reports.

blabb
  • Thanks. !analyze -v gives little information (the direct symptom is a csrss unhandled exception). You're probably right that this is not the stack, but I don't think it is corrupted: I believe that when the debugger complains 'Following frames may be wrong' it is dead serious. It was an x86 machine, and the lack of symbols for the Kaspersky driver might make stack walking virtually impossible (FPO?). – Ofek Shilon Aug 13 '18 at 08:35
  • Correction: this indeed isn't a stack, but not due to FPO or corruption. Quoting this answer [https://stackoverflow.com/a/1918944/89706] : "KiFastSystemCallRet means that the thread is in a syscall - an unfortunate aspect of x86 NT syscall dispatch is that it will not return the context back to the original place, but has to return to a static location in ntdll, which will fix up the context and put you back where you came from." I wonder if I should fix the question title to reflect this understanding... – Ofek Shilon Aug 13 '18 at 08:45