More intuitive backtrace of Valgrind Memcheck in a C++ program?

Question

I obtained the following output after running my C++ program, built in debug mode, with

valgrind --tool=memcheck --leak-check=full ./my_program

==1904766== 
==1904766== HEAP SUMMARY:
==1904766==     in use at exit: 209,434 bytes in 1,309 blocks
==1904766==   total heap usage: 871,805 allocs, 870,496 frees, 76,151,918 bytes allocated
==1904766== 
==1904766== 896 bytes in 2 blocks are possibly lost in loss record 1,289 of 1,302
==1904766==    at 0x4857A83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1904766==    by 0x40147D9: calloc (rtld-malloc.h:44)
==1904766==    by 0x40147D9: allocate_dtv (dl-tls.c:375)
==1904766==    by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==1904766==    by 0x25379834: allocate_stack (allocatestack.c:430)
==1904766==    by 0x25379834: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==1904766==    by 0x2508F388: std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30)
==1904766==    by 0x14686EA2: std::thread::thread<void (std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>::*)(), std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>*, void>(void (std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>::*&&)(), std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>*&&) (std_thread.h:142)
==1904766==    by 0x14686AB4: std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>::_Async_state_impl<std::function<void ()> const&>(std::function<void ()> const&) (future:1731)
==1904766==    by 0x1468660C: void std::_Construct<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>, std::function<void ()> const&>(std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>*, std::function<void ()> const&) (stl_construct.h:119)
==1904766==    by 0x1468627F: void std::allocator_traits<std::allocator<void> >::construct<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>, std::function<void ()> const&>(std::allocator<void>&, std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>*, std::function<void ()> const&) (alloc_traits.h:635)
==1904766==    by 0x14685DC0: std::_Sp_counted_ptr_inplace<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::function<void ()> const&>(std::allocator<void>, std::function<void ()> const&) (shared_ptr_base.h:604)
==1904766==    by 0x14685830: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>, std::allocator<void>, std::function<void ()> const&>(std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>*&, std::_Sp_alloc_shared_tag<std::allocator<void> >, std::function<void ()> const&) (shared_ptr_base.h:971)
==1904766==    by 0x146853C1: std::__shared_ptr<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void>, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<void>, std::function<void ()> const&>(std::_Sp_alloc_shared_tag<std::allocator<void> >, std::function<void ()> const&) (shared_ptr_base.h:1712)
==1904766==    by 0x1468439E: std::shared_ptr<std::__future_base::_Async_state_impl<std::thread::_Invoker<std::tuple<std::function<void ()> > >, void> >::shared_ptr<std::allocator<void>, std::function<void ()> const&>(std::_Sp_alloc_shared_tag<std::allocator<void> >, std::function<void ()> const&) (shared_ptr.h:464)
==1904766== 
==1904766== 1,792 bytes in 4 blocks are possibly lost in loss record 1,292 of 1,302
==1904766==    at 0x4857A83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1904766==    by 0x40147D9: calloc (rtld-malloc.h:44)
==1904766==    by 0x40147D9: allocate_dtv (dl-tls.c:375)
==1904766==    by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==1904766==    by 0x25379834: allocate_stack (allocatestack.c:430)
==1904766==    by 0x25379834: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==1904766==    by 0x24DE0E1C: rml::internal::thread_monitor::launch(void* (*)(void*), void*, unsigned long) (thread_monitor.h:218)
==1904766==    by 0x24DE13D4: tbb::internal::rml::private_worker::wake_or_launch() (private_server.cpp:297)
==1904766==    by 0x24DE0AE1: tbb::internal::rml::private_server::wake_some(int) (private_server.cpp:395)
==1904766==    by 0x24DE11AE: tbb::internal::rml::private_server::propagate_chain_reaction() (private_server.cpp:157)
==1904766==    by 0x24DE042C: tbb::internal::rml::private_worker::run() (private_server.cpp:257)
==1904766==    by 0x24DE02F3: tbb::internal::rml::private_worker::thread_routine(void*) (private_server.cpp:219)
==1904766==    by 0x25378B42: start_thread (pthread_create.c:442)
==1904766==    by 0x25409BB3: clone (clone.S:100)
==1904766== 
==1904766== LEAK SUMMARY:
==1904766==    definitely lost: 0 bytes in 0 blocks
==1904766==    indirectly lost: 0 bytes in 0 blocks
==1904766==      possibly lost: 2,688 bytes in 6 blocks
==1904766==    still reachable: 206,746 bytes in 1,303 blocks
==1904766==         suppressed: 0 bytes in 0 blocks
==1904766== Reachable blocks (those to which a pointer was found) are not shown.
==1904766== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1904766== 
==1904766== For lists of detected and suppressed errors, rerun with: -s
==1904766== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

For instance, the backtrace associated with "896 bytes in 2 blocks are possibly lost in loss record..." is not very descriptive to me. Is there a way to link this backtrace to the function or place in my code that caused it?

You might want to [post text as text](https://meta.stackoverflow.com/q/285551/509868) and not as a picture — anatolyg, Mar 29 '23 at 16:20
address sanitizer might give better output and isn't as slow either — Alan Birtles, Mar 29 '23 at 16:21
@AlanBirtles Might be worth to check out, but can you help with interpreting the above output? — Simon, Mar 29 '23 at 16:51
It looks like your stack traces are truncated. Does [this answered question](https://stackoverflow.com/q/11242795/509868) help? — anatolyg, Mar 29 '23 at 17:27
@anatolyg Only partially. Adding --num-callers=500 gives an expanded backtrace for "896 blocks...", however, the backtrace associated with "1,792 bytes in 4 blocks..." is identical, i.e., the backtrace is already completely shown in my question. So what does "by 0x25409BB3: clone (clone.S:100)" mean? Is it something I should worry about? — Simon, Mar 29 '23 at 21:39
@AlanBirtles maybe, but if you need to "build world" so that all your dependencies are also sanitized then good luck with that. — Paul Floyd, Mar 30 '23 at 09:31

Paul Floyd · Accepted Answer · 2023-03-30T09:35:43.967

2

Valgrind has no way of knowing what is system code and what is user code.

It uses several techniques to generate stack traces, and it gets the file name and line number information from the DWARF debug information.

You need to spend more time learning how to read and interpret stack traces - the same thing applies to backtraces obtained with debuggers and pstack.

"896 bytes in 2 blocks": what don't you understand there? A total of 896 bytes was allocated in two calls to the allocation function, both with this same call stack.

"possibly lost": OK, you need to read the manual for that. The reports cover more than just the case where a pointer to allocated memory has gone out of scope. See the manual or this article.

"loss record": that doesn't mean much for you as a user. Valgrind groups leaked blocks that share a call stack into loss records, and "1,289 of 1,302" is just the position of this record in that list.

Reading the call stack: "calloc" is your allocation function; I hope that all developers know that. "rtld" and "dl-tls" you need to learn: they belong to the link loader (the dynamic linker), the shared library responsible for loading the other shared libraries that your exe uses and for resolving global data and functions. TLS is thread-local storage. "pthread_create" is fairly obvious: the function that creates a new thread. "tbb" is the Intel Threading Building Blocks library, which you are using directly or indirectly. "std::" is all stuff in the C++ standard library; again, all developers should know that.

There are also situations where there is no obvious link between the leak and your code. Examples of this might be globals, file and function statics.

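What I read from all that is that you have a leak caused by memory that is allocated when new threads are set up: "pthread_create" calls "allocate_stack", which allocates the thread's TLS data through "_dl_allocate_tls". In the first record the thread is started through the C++ standard library ("std::thread", reached via the "std::__future_base" machinery); in the second it is started by TBB.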

Can you tell if you are using detached or joinable threads? If you are using "detached" threads then leaks like this may be inevitable and you can't do anything about it. If your threads are joinable, you may be missing the join that does the resource releasing.
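The joinable-versus-detached distinction can be sketched as follows. This is only an illustration, not code from the question: the helper names `run_joined`, `run_async` and `run_detached` are made up. The `std::async` variant is included because frames such as `_Async_state_impl` in the first loss record are what libstdc++ uses for `std::async` with `std::launch::async`, so a call of that shape somewhere in the program or one of its libraries is a likely origin.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <thread>

// Hypothetical helper: a joinable thread that is joined before returning.
// join() waits for the thread and lets the threading library release the
// resources that were allocated for it.
int run_joined() {
    int result = 0;
    std::thread worker([&result] { result = 21 * 2; });
    worker.join();
    assert(!worker.joinable());  // after join() the handle owns no thread
    return result;
}

// Hypothetical helper: std::async with std::launch::async starts a thread
// behind the scenes. get() waits for the task to finish, and the future's
// shared state joins the thread, so no explicit join() is written here.
int run_async() {
    int result = 0;
    std::function<void()> task = [&result] { result = 7; };
    std::future<void> done = std::async(std::launch::async, task);
    done.get();
    return result;
}

// Hypothetical helper: a detached thread. Nothing ever joins it, so if the
// process exits while it is still alive, its thread-local storage may still
// be allocated, and Memcheck may report that memory as "possibly lost".
void run_detached(std::shared_ptr<std::atomic<bool>> started) {
    std::thread worker([started] { started->store(true); });
    worker.detach();
}
```

If the threads are created inside a library (as the TBB frames suggest), the join or shutdown is the library's responsibility, and the report is then something to note or suppress rather than something to fix in your own code.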

edited Mar 30 '23 at 09:35

answered Mar 30 '23 at 07:46

Paul Floyd


The function I call probably uses detached threads. I called valgrind because I observed that my program has undefined behavior: Adding a print statement changes the output of my program. The above call to valgrind does not suggest that there are some memory issues if we forget about the leaks caused by the thread allocation, right? – Simon Mar 30 '23 at 08:43
Leaks like this are fairly harmless. As long as you're not creating large numbers of detached threads then you should be OK. – Paul Floyd Mar 30 '23 at 09:05
Well, if I see that adding a print statement changes the output of my program in release mode but not in debug mode, how would you debug this? I mean, Valgrind's memory check, for instance, only makes sense when running in debug mode, right? – Simon Mar 30 '23 at 09:35
You can also try thread hazard detectors (thread sanitizer, Valgrind DRD and Valgrind Helgrind). I've seen cases where the extra millisecond delay caused by printf causes a race condition to change. – Paul Floyd Mar 30 '23 at 09:39
I will try that. If I run the program repeatedly, with or without the print statement, the output is always the same (deterministic). Shouldn't the extra delay you are referring to also change the output between several runs with the same settings? – Simon Mar 30 '23 at 09:44
These things tend to be fairly random. Adding some delay (like a printf) can make things more probable. – Paul Floyd Apr 04 '23 at 14:15
I deactivated multi-threading in my program and also ran it with clang's address and undefined behavior sanitizers -- no bugs were detected. How would you proceed with debugging if none of these tools detects a bug? – Simon Apr 04 '23 at 14:30
Does thread sanitizer say anything? – Paul Floyd Apr 04 '23 at 20:03
No, program runs through without any warnings/errors – Simon Apr 05 '23 at 09:06
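
The kind of timing-dependent bug discussed in the comments above can be sketched like this. It is a generic illustration with a made-up function name (`count_with_atomic`), not a diagnosis of the program in the question: several threads update one shared counter, and the comments mark where a data race would appear if the synchronization were removed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical example: num_threads threads each add to a shared counter.
// If `counter` were a plain `long` incremented with ++counter, this would be
// a data race (undefined behavior): the final value could depend on timing,
// and a small delay such as a printf could change what is observed.
// Thread sanitizer, Helgrind and DRD are designed to report that pattern.
// std::atomic makes every increment indivisible, so the result is fixed.
long count_with_atomic(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);

    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(num_threads));

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Join every worker before reading the result and before returning.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return counter.load();
}
```

A race of this kind can stay hidden across many identical runs and only show up when timing shifts, for example between debug and release builds.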
