
I've followed the instructions in a somewhat similar SO question, but the line number I get doesn't make sense.

I have an Apache process running PHP with a custom PHP extension written in C. Occasionally I get a segmentation fault and a core dump.

This is running on an up to date RHEL 6.5 server.

I've installed the debuginfo rpm for our PHP extension. Opening the core in gdb and running bt gives me:

(gdb) bt
#0  0x00007f7d7b99dd2f in ?? () from /etc/httpd/modules/libphp5.so
#1  0x00007f7d74783c81 in zif_x12_parse_file () from /usr/lib64/php/modules/x12_parser.so
#2  0x00007f7d7ba093b8 in ?? () from /etc/httpd/modules/libphp5.so
#3  0x00007f7d7b9e0500 in execute () from /etc/httpd/modules/libphp5.so
#4  0x00007f7d7b9ba70d in zend_execute_scripts () from /etc/httpd/modules/libphp5.so
#5  0x00007f7d7b968798 in php_execute_script () from /etc/httpd/modules/libphp5.so
#6  0x00007f7d7ba43d75 in ?? () from /etc/httpd/modules/libphp5.so
#7  0x00007f7d864b4bb0 in ap_run_handler ()
#8  0x00007f7d864b846e in ap_invoke_handler ()
#9  0x00007f7d864c3b30 in ap_process_request ()
#10 0x00007f7d864c09a8 in ?? ()
#11 0x00007f7d864bc6b8 in ap_run_process_connection ()
#12 0x00007f7d864c8977 in ?? ()
#13 0x00007f7d864c8c8a in ?? ()
#14 0x00007f7d864c8fbb in ap_mpm_run ()
#15 0x00007f7d864a0900 in main ()

I’m assuming that the problem is actually in our code, not in PHP at the top of the stack. So I looked up where the .so is loaded:

(gdb) info shared x12
From                To                  Syms Read   Shared Object Library
0x00007f7d7477ec20  0x00007f7d74784038  Yes         /usr/lib64/php/modules/x12_parser.so

Subtracting the library's start address from the address in frame #1:

0x00007f7d74783c81 - 0x00007f7d7477ec20 = 0x5061

And punching that into addr2line:

$ addr2line -e /usr/lib/debug/usr/lib64/php/modules/x12_parser.so.debug 0x5061
/usr/src/debug/php-x12_parser-5.3.3/php-x12_parser/x12_parser.c:321

But line 321 isn't in zif_x12_parse_file. And even if it were, it isn't a line that I imagine could crash.

So what have I done wrong in calculating the line of the crash?

Ben Mathews

3 Answers

  • Recompile the library with -ggdb3.
  • Then re-run the program (preferably under gdb).
  • Look at the stack trace.

Note: the frames shown as ?? come from files for which gdb has no symbol or debug information.

Mihai Maruseac
user3629249
  • It was already compiled with -g, which is the generic version of -ggdb. -ggdb3 raises the level of debugging information, but wouldn't solve the problem I was seeing. – Ben Mathews Nov 26 '14 at 17:08

Turned out I was making this way too hard. I simply needed to have the debuginfo rpm installed prior to the crash. Now I get nice backtraces.

#0  0x00007fc97334fd2f in ?? () from /etc/httpd/modules/libphp5.so
#1  0x00007fc96c135b3b in zif_x12_parse_file (ht=<value optimized out>,return_value=0x7fc98ec278f8, return_value_ptr=<value optimized out>, this_ptr=<value optimized out>, return_value_used=<value optimized out>) at /usr/src/debug/php-x12_parser-5.3.3/php-x12_parser/x12_parser.c:609
#2  0x00007fc9733bb3b8 in ?? () from /etc/httpd/modules/libphp5.so
#3  0x00007fc973392500 in execute () from /etc/httpd/modules/libphp5.so
Ben Mathews

Rather than using the base address from gdb's info sharedlibrary command, use the base address from the process's memory map, /proc/#####/maps.

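The From address that gdb reports is the lowest address of the shared library's executable code, generally the address of the .text section in the ELF file relocated by the base of the mapped region. It is therefore higher than the mapping base by the .text section's address, and an offset computed from it is relative to .text rather than to the start of the file, which is not what addr2line expects.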

Here's an example:

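#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char arg[64];

    /* Print this process's libc mappings, then abort so gdb stops here. */
    snprintf(arg, sizeof arg, "grep libc /proc/%d/maps", (int)getpid());
    system(arg);
    abort();
}

$ gdb ab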
(gdb) run
7ffff7a14000-7ffff7bcf000 r-xp 00000000 08:01 397695 /lib/x86_64-linux-gnu/libc-2.19.so
7ffff7bcf000-7ffff7dcf000 ---p 001bb000 08:01 397695 /lib/x86_64-linux-gnu/libc-2.19.so
7ffff7dcf000-7ffff7dd3000 r--p 001bb000 08:01 397695 /lib/x86_64-linux-gnu/libc-2.19.so
7ffff7dd3000-7ffff7dd5000 rw-p 001bf000 08:01 397695 /lib/x86_64-linux-gnu/libc-2.19.so
Program received signal SIGABRT, Aborted.
0x00007ffff7a4abb9 in __GI_raise (sig=sig@entry=6) at
                      ../nptl/sysdeps/unix/sysv/linux/raise.c:56
(gdb) info sharedlibrary
From                To                  Syms Read  Shared Object Library
0x00007ffff7ddaae0  0x00007ffff7df54e0  Yes        /lib64/ld-linux-x86-64.so.2
0x00007ffff7a334a0  0x00007ffff7b790c3  Yes        /lib/x86_64-linux-gnu/libc.so.6

$ eu-readelf  -S  /lib/x86_64-linux-gnu/libc-2.19.so | egrep 'Name|text'
[Nr] Name      Type         Addr             Off      Size     ES Flags Lk Inf Al
[12] .text     PROGBITS     000000000001f4a0 0001f4a0 00145c23  0 AX     0   0 16

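Adding the .text address to the base of the first mapping reproduces the From address that gdb reports for libc:

$ printf "%x\n" $(( 0x7ffff7a14000 + 0x1f4a0 ))
7ffff7a334a0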

Give addr2line the difference between the desired instruction address and the base of the mapped region:

$ printf "%x\n" $(( 0x00007ffff7a4abb9 - 0x7FFFF7A14000))
36bb9
$ addr2line -e /lib/x86_64-linux-gnu/libc-2.19.so 36bb9
/build/buildd/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56
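
The subtraction above can be wrapped in a small shell function. This is only a sketch: so_offset is a hypothetical helper name, and it assumes the base passed in is the start of the library's first mapping from the maps file.

```shell
# so_offset BASE ADDR   (hypothetical helper, not part of any tool)
# Print ADDR - BASE in hex: the file-relative offset to hand to addr2line.
# BASE is the start of the library's first mapping in the maps file,
# ADDR is the runtime address from the backtrace (both with a 0x prefix).
so_offset() {
    printf '%x\n' "$(( $2 - $1 ))"
}

# Values taken from the libc example in this answer
so_offset 0x7ffff7a14000 0x00007ffff7a4abb9   # prints 36bb9
```

The printed offset is what goes on the addr2line command line in place of the hand-computed 36bb9.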

Keep in mind that the address may point to the instruction after the one of interest (for a caller's frame it is a return address, which follows the call instruction), so in some cases (but not in this example) the reported source line may be off by one.
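
For a frame other than the innermost one, a common workaround is to look up the byte just before the return address, which still lies inside the call instruction. A minimal sketch, assuming that convention; call_site_offset is a hypothetical helper and the addresses in the usage line are made up for illustration:

```shell
# call_site_offset BASE RETADDR   (hypothetical helper)
# Print (RETADDR - BASE - 1) in hex: an offset inside the call instruction
# that precedes the return address, suitable for passing to addr2line.
call_site_offset() {
    printf '%x\n' "$(( $2 - $1 - 1 ))"
}

# Illustrative values only, not taken from a real process
call_site_offset 0x1000 0x1234   # prints 233
```

Whether the adjusted offset maps to a different source line depends on how the compiler laid out the code, so it is worth comparing the result with the unadjusted lookup.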

Mark Plotnick