
Edit: I appear to have been mistaken. The backtrace works wonderfully from anywhere on Linux. It is only when remote debugging from GDB on Ubuntu against a remote Windows target that the stack trace gets absolutely destroyed after entering one of the memory allocation functions in msvcrt... dammit, Microsoft.

And this happens on both 64-bit and 32-bit Windows, so I'm not sure this is related to the unwind information...

Edit: Adding -g3 and -Og appears to have helped with part of the issue in some programs, but the problem persists in others. I cannot post their source here as it is my company's IP, sorry!

Background

I am using GCC to compile Ubuntu->Ubuntu and MinGW to cross-compile Ubuntu->Windows.

I have created a cross-platform (Linux + Windows) memory tracking and leak detection library which hooks malloc/calloc/realloc/free by byte-patching the first instructions of each function with assembly (not IAT/PLT hooking).

The hook redirects to a gate which checks whether hooks are enabled in the current thread. If they are, the gate redirects to the memory tracking hook function; if they are disabled for that thread, it redirects straight to the trampoline of the real function.
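As a rough illustration of that gate, here is a minimal sketch in C. All names (`malloc_gate`, `tracking_malloc`, `malloc_trampoline`, `hooks_enabled`, `tracked_calls`) are hypothetical, and plain `malloc` stands in for the trampoline, since the actual byte patch is not shown here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout. The real library patches malloc's first
   instructions; this sketch only models the gate's decision. */
typedef void *(*malloc_fn)(size_t);

static _Thread_local int hooks_enabled = 0;     /* per-thread switch */
static _Thread_local size_t tracked_calls = 0;  /* per-thread counter */

/* Stand-in for the trampoline that reaches the unpatched malloc. */
static malloc_fn malloc_trampoline = malloc;

/* Tracking path: allocate through the trampoline and record the call. */
static void *tracking_malloc(size_t size) {
    hooks_enabled = 0;                  /* do not re-enter while tracking */
    void *p = malloc_trampoline(size);
    tracked_calls++;
    hooks_enabled = 1;
    return p;
}

/* Gate: the patched entry point would jump here. */
void *malloc_gate(size_t size) {
    if (!hooks_enabled)
        return malloc_trampoline(size); /* hooks off: straight through */
    return tracking_malloc(size);       /* hooks on: go via the tracker */
}
```

Keeping the flag thread-local means one thread can disable tracking (for example while it is inside the tracker itself) without affecting allocations made by other threads.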

The library works great and detects leaks on Linux and Windows (it would probably work on Mac too, but I don't have one).

I use the library to programmatically detect leaks from within my code. I can install callbacks on the memory allocation routines and programmatically raise breakpoints inside the callbacks (by looping until a debugger attaches, then executing asm("int3")), so that I can attach to my program while it is inside a call that leaks memory.
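The wait-then-break step might look roughly like the sketch below. The names `debugger_attached`, `wait_for_debugger`, `break_here` and `break_in_callback` are made up for illustration, and the flag is assumed to be set by hand from the debugger after attaching.

```c
#include <stddef.h>

/* Hypothetical flag: after attaching, set it from GDB with
   `set var debugger_attached = 1` to leave the loop. */
static volatile int debugger_attached = 0;

/* Raise a breakpoint trap. int3 is x86-only, so other targets fall back
   to the GCC/Clang builtin trap in this sketch. */
static void break_here(void) {
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("int3");
#else
    __builtin_trap();
#endif
}

/* Busy-wait until the flag is set; returns how many times it polled. */
unsigned long wait_for_debugger(void) {
    unsigned long polls = 0;
    while (!debugger_attached)
        polls++;
    return polls;
}

/* Intended to be called from inside an allocation callback. */
void break_in_callback(void) {
    wait_for_debugger();
    break_here();
}
```

A real version would probably sleep between polls rather than spin, but the sleep call differs between Linux and Windows, so it is left out here.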

Everything works great until I try to view a backtrace from within my callback. I understand this is probably because the unwind information no longer matches my stack, since the hook routines I inserted add new frames and data.

Edit: If I am mistaken about the unwind info mismatching the stack being the cause of the incorrect backtrace then please correct me!

The Question

Are there any small hacks I can do to trick GDB into correctly rebuilding the backtrace from within my hook callbacks?

I understand that I can manually walk and edit the unwind info with libdwarf or something similar, but I imagine that would be incredibly cumbersome and a lot of code.

So I am wondering if perhaps there is a hack or a cheat I can do which would trick GDB into properly rebuilding the backtrace?

If there are no easy hacks or tricks then what are all of my options for fixing this issue?

Edit: Just to clear up the exact call order of everything:

program
   V
malloc
   V
hook_malloc -> hooks are disabled -> return malloc trampoline -> real malloc -> program
   V
hooks are enabled 
   V
Call original malloc -> malloc trampoline -> real malloc -> returns to hook
   V
Record memory size/info etc from malloc
   V
Call user defined callback -> **User defined callback** -> returns to hook
   V
return to program

It is the "User Defined Callback" where I want to capture a backtrace
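To make the call order above concrete, here is a simplified sketch of the hook in C. The callback signature `(s, rv)` is taken from the backtrace shown in the answer below; everything else (`hook_malloc_sketch`, `set_malloc_callback`, `example_callback`, `bytes_recorded`, and using plain `malloc` in place of the trampoline) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Callback shape assumed from the backtrace: malloc_callback(s, rv). */
typedef void (*malloc_callback_fn)(size_t s, void *rv);

static _Thread_local int hooks_enabled = 0;
static malloc_callback_fn user_callback = NULL;
static size_t bytes_recorded = 0;   /* not thread-safe; sketch only */

/* Example callback that just remembers its last arguments. */
static size_t last_size = 0;
static void *last_ptr = NULL;
static void example_callback(size_t s, void *rv) {
    last_size = s;
    last_ptr = rv;
}

void set_malloc_callback(malloc_callback_fn cb) { user_callback = cb; }

void *hook_malloc_sketch(size_t s) {
    if (!hooks_enabled)
        return malloc(s);        /* stand-in for trampoline -> real malloc */

    hooks_enabled = 0;           /* allocations made below bypass the hook */
    void *rv = malloc(s);        /* call original malloc, returns to hook */
    if (rv != NULL)
        bytes_recorded += s;     /* record memory size/info */
    if (user_callback != NULL)
        user_callback(s, rv);    /* the backtrace is wanted in here */
    hooks_enabled = 1;
    return rv;                   /* return to program */
}
```

By the time `user_callback` runs, the frames for the hook and the callback sit between the program and the point where the backtrace is taken.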

user6567423
  • If you leave the trampoline via a tail call then it will become 'transparent' to any stack tracer. That would be simplest if possible in your case. – zch Nov 09 '18 at 19:34
  • @zch I've tried to add a tree to my original post which explains the order of calls, you will see that the trampoline is only called when the original malloc needs to be called. The place I want to backtrace is in the User Defined Callback. First malloc is redirected to hook_malloc, then hook_malloc calls original malloc to allocate memory, then hook_malloc calls a user defined malloc callback (which I install from my code when I init the library). It is here I want to create a backtrace but hook_malloc and the user defined callback have significantly shifted the stack by that point. – user6567423 Nov 09 '18 at 19:47
  • At this breakpoint do you have any DWARF-less frames on the stack? If so, you could jump (tail call) into a C function removing this frame from the stack, as if it was never there. This C function would do the core work, including calling any other functions, including user-supplied functions. – zch Nov 09 '18 at 20:18
  • Yes the hook_malloc would have created a frame and the user defined callback would have created a frame, both of which wouldn't have DWARF information for unwinding. So you're saying if I tail-call out of the user-defined-callback in order to drop those frames that I should be able to generate a backtrace? I will have to see if I can get this to work :) – user6567423 Nov 09 '18 at 20:39
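The tail-call idea from the comments could be sketched as follows. This is only an illustration with hypothetical names (`hook_entry`, `hook_core`, `core_calls`), and it relies on GCC's sibling-call optimization, which is enabled at -O2 or with -foptimize-sibling-calls and is not guaranteed by the C language.

```c
#include <stddef.h>
#include <stdlib.h>

static size_t core_calls = 0;

/* Core work lives in an ordinary C function that has normal debug info.
   noinline (GCC/Clang attribute) keeps it a separate function. */
__attribute__((noinline))
static void *hook_core(size_t s) {
    core_calls++;
    return malloc(s);   /* stand-in for the tracking logic */
}

/* Entry point: nothing happens after the call, so the compiler may turn
   it into a jump. When it does, hook_entry leaves no frame of its own
   and an unwinder sees hook_core as called directly by the program. */
void *hook_entry(size_t s) {
    return hook_core(s);
}
```

Whether the jump was actually emitted can be checked by disassembling `hook_entry` and looking for a `jmp` to `hook_core` instead of a `call`. A hand-written assembly gate can do the same thing explicitly by ending in a `jmp` rather than a `call` followed by `ret`.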

1 Answer

-1

Apparently this is the same problem as "GDB Windows ?? in Backtraces".

And the solution was to simply add -g3 to the MinGW compile flags and voilà, I have non-broken backtraces!

Edit: Never mind, this isn't the whole answer. The fix worked for some test programs, but other programs still show incorrect backtraces like:

(gdb) bt
#0  malloc_callback (s=38, rv=0x2c5058) at test_dll.c:729
#1  0x000000000040731d in hook_malloc_raw (file=0x410ea1 <__FUNCTION__.63079+55> "", function=0x410ea1 <__FUNCTION__.63079+55> "", line=0, s=38, rv=8791758343065)
#2  0x0000000000407367 in hook_malloc (s=38)
#3  0x000007fefda20b9e in ?? ()
#4  0x0000000000000026 in ?? ()
#5  0x0000000000410ea1 in __FUNCTION__.63079 ()
#6  0x0000000000000000 in ?? ()

Obviously frame #4 isn't actually a stack frame (0x26 is just the allocation size, 38), and I'm not sure why frame #5 is labeled "__FUNCTION__.63079".

Edit2: If people are going to downvote this at least leave a comment saying why

user6567423