I am using two threads, and both are getting stuck with the following stack traces:

Thread 2:

(gdb) bt
#0 0x00007f9e1d7625bc in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007f9e1d6deb35 in _L_lock_17166 () from /lib64/libc.so.6
#2 0x00007f9e1d6dbb73 in malloc () from /lib64/libc.so.6
#3 0x00007f9e1d6c4bad in __fopen_internal () from /lib64/libc.so.6
#4 0x00007f9e1dda2210 in std::__basic_file<char>::open(char const*, std::_Ios_Openmode, int) () from /lib64/libstdc++.so.6
#5 0x00007f9e1dddd5ba in std::basic_filebuf<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode) () from /lib64/libstdc++.so.6
#6 0x00000000005e1244 in fatalSignalHandler(int, siginfo*, void*) ()
#7 <signal handler called>
#8 0x00007f9e1d6d6839 in malloc_consolidate () from /lib64/libc.so.6
#9 0x00007f9e1d6d759e in _int_free () from /lib64/libc.so.6

_int_free is being called as a result of a default destructor.

Thread 1:

(gdb) bt
#0 0x00007f9e2a4ed54d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f9e2a4e8e9b in _L_lock_883 () from /lib64/libpthread.so.0
#2 0x00007f9e2a4e8d68 in pthread_mutex_lock () from /lib64/libpthread.so.0

From the question "Threads getting stuck with few threads at point 'in __lll_lock_wait'", I learned that __lll_lock_wait() is called when a thread cannot acquire a mutex because something else (in this case, I guess, Thread 2) still holds it.

But Thread 2 is also stuck, with the stack trace shown above, and since the binaries were built without debug symbols, I can't check who owns the mutex. So my questions are:

  1. What is the purpose of __lll_lock_wait_private(), and what causes it to be called?
  2. Without debug symbols, is there any hint as to what the issue is and where it could be?
  3. I have seen hangs in malloc_consolidate() on Linux several times. Is this a well-known issue that has yet to be solved?
Rachid K.
Shreyans
  • These are internal structures maintained by the C library. Somewhere in your code there is a bug, or bugs, that corrupt memory, causing hilarity to ensue. You will have to do some debugging to find them and fix them. – Sam Varshavchik Apr 06 '20 at 12:33
  • @SamVarshavchik. Yes, will look into it. But what is the purpose of ```__lll_lock_wait_private ()``` ? – Shreyans Apr 07 '20 at 06:50
  • Why does it matter what it is? It's an internal function that implements part of the functionality of locking a mutex. I don't see why it matters. The stack trace shows code that is not async-signal-safe being invoked from a signal handler. You cannot do that. C++ does not work this way. No, a "hang in case of malloc_consolidate()" is not a "well known" issue, and there is nothing to "solve" here. Your code, which calls C++ library functions from a signal handler, must be fixed so that it does not do that. – Sam Varshavchik Apr 07 '20 at 11:07
  • @SamVarshavchik - I'm the guy who was asking the dumb question about fgets() yesterday. Turns out the problem is this exact problem and fgets() had nothing to do with it (although I still can't explain why moving fgets() out of the if statement fixed it.) One of the threads is doing a new() object call and locking everything up. Your suggestion to use strace and gdb is how I figured it out. The stack trace I have is nearly identical to that in this example. Thanks pal. – Qman Aug 24 '23 at 19:48

1 Answer

Frames 6 and 7 of thread 2 suggest a custom signal handler was installed. Frame 5 suggests it is trying to do something like write to a file (std::ofstream?).

That is not allowed. Very little is allowed in signal handlers, and definitely not iostreams.
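
For comparison, here is a minimal sketch of a handler that stays within what is allowed, assuming a POSIX system. It is not the asker's `fatalSignalHandler`; the names `g_crashFd`, `writeCrashNote`, `fatalSignalHandlerSafe` and `installFatalHandler` are hypothetical. The idea is to open the log file once at startup and, inside the handler, use only a stack buffer plus `write`, `signal` and `raise`, all of which POSIX lists as async-signal-safe.

```cpp
#include <signal.h>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical global: crash-log descriptor, opened at startup, never in the handler.
static int g_crashFd = -1;

// Writes "fatal signal N\n" using only a stack buffer and write(2).
// No malloc, no stdio, no iostreams.
void writeCrashNote(int fd, int sig) {
    char buf[32];
    const char prefix[] = "fatal signal ";
    std::size_t len = 0;
    for (; len < sizeof(prefix) - 1; ++len) buf[len] = prefix[len];

    // Convert the signal number to decimal by hand instead of calling snprintf.
    char digits[12];
    int n = 0;
    unsigned v = sig < 0 ? 0u : static_cast<unsigned>(sig);
    do {
        digits[n++] = static_cast<char>('0' + v % 10);
        v /= 10;
    } while (v != 0);
    while (n > 0) buf[len++] = digits[--n];
    buf[len++] = '\n';

    const char* p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w <= 0) break;  // give up quietly; there is nothing safe left to do
        p += w;
        len -= static_cast<std::size_t>(w);
    }
}

extern "C" void fatalSignalHandlerSafe(int sig, siginfo_t*, void*) {
    if (g_crashFd >= 0) writeCrashNote(g_crashFd, sig);
    // Restore the default action and re-raise, so the process still dies
    // with the original signal once the handler returns.
    signal(sig, SIG_DFL);
    raise(sig);
}

// Call once during startup, before any fatal signal can arrive.
bool installFatalHandler(const char* path, int sig) {
    g_crashFd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (g_crashFd < 0) return false;
    struct sigaction sa = {};
    sa.sa_sigaction = fatalSignalHandlerSafe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, nullptr) == 0;
}
```

A program would call something like `installFatalHandler("crash.log", SIGSEGV)` early in `main`; the path is only an example. Anything richer than this, such as formatted output or a stack dump through iostreams, is better done outside the handler.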

Suppose you are in a function like malloc_consolidate which may have to touch the global arena, and take a lock to do it, and a signal comes along. If you allocate memory in the signal handler, you also need the same lock, which is already being held. Thread 2 is deadlocking itself.
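
The mechanism can be reproduced in miniature without touching malloc at all. The sketch below is an illustration under assumptions, not glibc's actual code: `g_arenaLock` is a hypothetical stand-in for the arena lock, and the handler only *tries* the lock once instead of blocking, so the program reports the problem rather than hanging. All names are invented for the example.

```cpp
#include <atomic>
#include <signal.h>

// Hypothetical stand-in for the allocator's arena lock.
static std::atomic_flag g_arenaLock = ATOMIC_FLAG_INIT;
// Set by the handler if it found the lock already taken.
static volatile sig_atomic_t g_lockWasHeld = 0;

extern "C" void probeHandler(int) {
    // Try the lock once. A real malloc() would wait here instead, and the
    // wait could never end: the holder is the code this handler interrupted.
    if (g_arenaLock.test_and_set(std::memory_order_acquire)) {
        g_lockWasHeld = 1;
    } else {
        g_arenaLock.clear(std::memory_order_release);
    }
}

// Returns true if the handler observed the lock held by the interrupted code.
bool signalArrivesWhileLockHeld() {
    signal(SIGUSR1, probeHandler);
    g_lockWasHeld = 0;

    // "malloc_consolidate" takes the lock...
    while (g_arenaLock.test_and_set(std::memory_order_acquire)) {
    }
    // ...and a signal arrives before it can release it. The handler runs on
    // this same thread and finishes before raise() returns.
    raise(SIGUSR1);
    // Only now does the interrupted code get a chance to unlock.
    g_arenaLock.clear(std::memory_order_release);

    return g_lockWasHeld == 1;
}
```

Calling `signalArrivesWhileLockHeld()` from `main` shows the handler finding the lock taken. Replace the single try in `probeHandler` with a blocking wait and you get the hang in frames 0 to 3 of thread 2.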

Jeff Garrett
  • Is it not possible that ```malloc_consolidate()``` frees the memory and releases the lock, and only then is the lock used for allocating memory? – Shreyans Apr 07 '20 at 06:49
  • No, it is not possible. You cannot call most C library functions from signal handlers. They are not async-signal-safe. The End. – Sam Varshavchik Apr 07 '20 at 11:07
  • While the signal handler runs, malloc_consolidate can do nothing, and cannot release the lock if it held it. – Jeff Garrett Apr 07 '20 at 12:30