My application uses more than 8 threads. When I run "info threads" in gdb, I see the threads and the last function each one was executing. It is not obvious to me which thread caused the SIGSEGV. Is it possible to tell? Is it thread 1? How are the threads numbered?

ks1322
russoue
  • I don't think the only answer that question has is the answer to my question. It talks about how to see back traces. – russoue Jan 05 '15 at 20:03
  • Welcome to SO, @russoue! The fact is that this question *is* a duplicate of that one. You're disappointed that the answer is unclear, and that makes sense. But the way to mitigate that is to add comments and/or bounties to that question. Unfortunately I think the ultimate answer is that you cannot know this exactly and that you must deduce it from the backtrace. – Brian Cain Jan 05 '15 at 20:07

1 Answer

When you open a core dump file in gdb, it stops at the function in which the program crashed, and the current thread is the one that crashed. Take the following program as an example:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
void *thread_func(void *p_arg)
{
        while (1) {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, NULL); /* NULL argument: printing it with %s crashes */

        sleep(1000);
        return 0;
}

Thread t2 crashes the program because it passes a NULL pointer to printf's %s conversion. After the crash, use gdb to analyze the core dump file:

[root@localhost nan]# gdb -q a core.32794
Reading symbols from a...done.
[New LWP 32796]
[New LWP 32795]
[New LWP 32794]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./a'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
(gdb)

gdb stops in the __strlen_sse2 function, which means the faulting instruction is in that function. Then use the bt command to see which thread called it:

(gdb) bt
#0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00000034e4268cdb in puts () from /lib64/libc.so.6
#2  0x00000000004005cc in thread_func (p_arg=0x0) at a.c:7
#3  0x00000034e4a079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000034e42e8b6d in clone () from /lib64/libc.so.6
(gdb) i threads
  Id   Target Id         Frame
  3    Thread 0x7ff6104c1700 (LWP 32794) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
  2    Thread 0x7ff6104bf700 (LWP 32795) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
* 1    Thread 0x7ff60fabe700 (LWP 32796) 0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6

The bt command shows the stack frames of the current thread, which is the one that crashed; frame #2 shows thread_func being called with p_arg=0x0. The "i threads" command lists all the threads, and the one marked with * is the current thread.
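
All of this presupposes that the crash actually left a core file, which only happens when the core size limit is nonzero. A minimal sketch, assuming a POSIX system, of raising the soft limit to the hard limit from inside the process; the helper name enable_core_dumps is hypothetical and not part of the program above. Running "ulimit -c unlimited" in the shell before starting the program is the more common way to get the same effect.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit.
 * Returns 0 on success and -1 on failure (errno is set by the failing call).
 * enable_core_dumps is a hypothetical helper name used for illustration. */
int enable_core_dumps(void)
{
        struct rlimit lim;

        if (getrlimit(RLIMIT_CORE, &lim) != 0)
                return -1;

        /* An unprivileged process may raise its soft limit up to the hard limit. */
        lim.rlim_cur = lim.rlim_max;
        return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling enable_core_dumps() at the start of main would be the natural place. If the hard limit itself is 0, the soft limit stays at 0 and no core is written. On Linux, where the file ends up and what it is called also depends on the system's core_pattern setting.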

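As for "How are the threads numbered?": the Id column is gdb's own numbering, while the Target Id (the thread handle and LWP) comes from the OS, so the order depends on the platform. Rather than assuming the crashing thread is thread 1, rely on the * marker. You can refer to the gdb manual for more information.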
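
Because SIGSEGV from a bad memory access is delivered to the thread that caused it, the program can also report the faulting thread itself, and the number it prints can be matched against the LWP shown in the Target Id column. A minimal sketch, assuming Linux with glibc; format_decimal, segv_handler and install_segv_logger are hypothetical names chosen for illustration, and the direct gettid system call is Linux-specific and not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Format value as decimal text without printf, which is not
 * async-signal-safe. size must be at least 1. Returns the string length. */
size_t format_decimal(unsigned long value, char *buf, size_t size)
{
        char tmp[24];
        size_t n = 0, len = 0;

        do {
                tmp[n++] = (char)('0' + value % 10);
                value /= 10;
        } while (value != 0);
        /* Digits were produced in reverse order; copy them back to front. */
        while (n > 0 && len + 1 < size)
                buf[len++] = tmp[--n];
        buf[len] = '\0';
        return len;
}

/* Runs in the faulting thread and writes its kernel thread ID to stderr. */
void segv_handler(int sig)
{
        static const char prefix[] = "SIGSEGV in LWP ";
        char line[64];
        size_t len = sizeof prefix - 1;
        ssize_t written;

        (void)sig;
        memcpy(line, prefix, len);
        /* Assumption: Linux; SYS_gettid returns the ID that gdb shows as the LWP. */
        len += format_decimal((unsigned long)syscall(SYS_gettid),
                              line + len, sizeof line - len - 1);
        line[len++] = '\n';
        written = write(STDERR_FILENO, line, len);
        (void)written;
        /* SA_RESETHAND has already restored the default action, so returning
         * re-runs the faulting instruction and the process should then die
         * with SIGSEGV as it would have without this handler. */
}

/* Install the handler for SIGSEGV. Returns 0 on success, -1 on failure. */
int install_segv_logger(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = segv_handler;
        sa.sa_flags = SA_RESETHAND;
        sigemptyset(&sa.sa_mask);
        return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_logger() at the start of main in the example program would make the crashing thread print its LWP before the process dies. The core dump remains the primary source of truth; this only adds a log line to cross-check against "i threads".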

Nan Xiao