A call to malloc is returning a pointer to a block of memory that overlaps memory already allocated by make_shared. I'm building a FUSE client, and the malloc call is in the FUSE library, but I'm not sure that's relevant. I wasn't able to reproduce the error outside my program, and I've got no idea what to do next. Valgrind doesn't find any errors until a pointer in the object managed by the shared_ptr is corrupted and then used.

br1ckd
  • Have you tried running it under `valgrind`? Something like that sounds possible if `malloc()`'s internal data structures become corrupted. – FatalError Apr 18 '13 at 01:50
  • Yes, no errors found until a pointer is corrupted and then used. – br1ckd Apr 18 '13 at 01:54
  • Are you sure the shared pointer hasn't released ownership of its object? This will cause the object to be deleted, and another `malloc` call can reuse that memory. – Barmar Apr 18 '13 at 02:02
  • There are actually 2 shared_ptrs for the object when it's overwritten, one is on the stack and the other is in a vector. – br1ckd Apr 18 '13 at 02:06
  • In decreasing order of likelihood, here are my guesses: (1) What you are claiming is wrong; you have misunderstood the situation. (2) You are somehow getting two memory allocation systems both claiming the same memory through a hellish set of libraries in use, possibly because they refer to two different address spaces. (3) You found a bug in your compiler and runtime. (4) Cosmic rays are consistently flipping bits in your computer to cause unexpected behavior. – Yakk - Adam Nevraumont Apr 18 '13 at 02:49
  • It seems like #1 is by far the likeliest scenario, but when I reproduce the bug in gdb it seems very obvious that this is what's happening. I've stepped through the code, from the malloc call to the point where my shared_ptr object is corrupted. malloc is called to allocate a 255 byte block. It returns `0x6da2d0` and the shared_ptr in the vector is pointing to `0x6da2e8`. I'll try again to isolate the error so I can post some sample code, but I'm not sure if I'll be able to. Is there another approach to debugging I could use? – br1ckd Apr 18 '13 at 04:38
  • Is it possible malloc's internal data structures were corrupted and valgrind didn't catch it with these arguments? `valgrind --tool=memcheck --leak-check=yes --show-reachable=yes --num-callers=20 --track-fds=yes` – br1ckd Apr 18 '13 at 04:41
  • Have you considered running memtest86 for 24 hours or so? Memory errors can cause cosmic rays to consistently flip bits in your computer, which can cause all kinds of problems, some relatively small like this and others relatively large like boot sector corruption. – autistic Apr 18 '13 at 04:42

1 Answer

This bug was the result of creating a `shared_ptr` with `new`, then casting the pointer to a `weak_ptr` pointer and deleting it through that type. The cast is there because I'm using a C library (FUSE) that provides a `uint64_t` to store a handle, and I pass it a pointer to a `weak_ptr`. The library then calls my functions and passes them a struct containing the pointer cast to a `uint64_t`.
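
For anyone hitting the same thing, here is a minimal sketch of the handle pattern, assuming the handle is meant to hold a heap-allocated `weak_ptr`. `OpenFile`, `make_handle`, `lock_handle`, and `release_handle` are hypothetical names for illustration; they are not part of FUSE or of my actual program. The point it tries to show is that `new` and `delete` see the same pointer type, since deleting an object through a pointer to an unrelated type is undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for whatever object the handle refers to.
struct OpenFile {
    int fd = -1;
};

// Heap-allocate a weak_ptr and pack its address into the 64-bit handle.
std::uint64_t make_handle(const std::shared_ptr<OpenFile>& file) {
    auto* weak = new std::weak_ptr<OpenFile>(file);
    return reinterpret_cast<std::uintptr_t>(weak);
}

// Recover the object; the result is empty if the object was already destroyed.
std::shared_ptr<OpenFile> lock_handle(std::uint64_t handle) {
    assert(handle != 0);  // a zero handle was never produced by make_handle
    auto* weak = reinterpret_cast<std::weak_ptr<OpenFile>*>(
        static_cast<std::uintptr_t>(handle));
    return weak->lock();
}

// Delete through the same type that was used with new (weak_ptr, not shared_ptr).
void release_handle(std::uint64_t handle) {
    delete reinterpret_cast<std::weak_ptr<OpenFile>*>(
        static_cast<std::uintptr_t>(handle));
}
```

Each handle returned by `make_handle` should be passed to `release_handle` exactly once, and `lock_handle` must not be called on it afterwards.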

br1ckd