6

Some preamble

It seems that malloc, calloc, realloc and free are all replicated in ld-linux.so and libc.so. As I understand it, the dynamic loader does this to handle memory management within ld-linux.so before libc.so is loaded and makes its memory management functions available. However, I have some questions about those duplicated symbols:

Here's a very simple C program calling malloc and exiting:

#include <stdlib.h>

int main()
{
  void *p = malloc(8);
  return 0;
}

I compile it with gcc on an x86_64 Linux box and do some debugging with gdb:

$ gcc -g -o main main.c
$ gdb ./main
(gdb) start
Temporary breakpoint 1 at 0x4004f8
Starting program: main 

Temporary breakpoint 1, 0x00000000004004f8 in main ()
(gdb) info symbol malloc
malloc in section .text of /lib64/ld-linux-x86-64.so.2
(gdb) b malloc
Breakpoint 2 at 0x7ffff7df0930: malloc. (2 locations)
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   <MULTIPLE>         
2.1                         y     0x00007ffff7df0930 in malloc at dl-minimal.c:95
2.2                         y     0x00007ffff7a9f9d0 in __GI___libc_malloc at malloc.c:2910

Running nm on libc.so and ld.so reveals the following:

$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep malloc
00000000000829d0 T __libc_malloc
00000000003b6700 V __malloc_hook
00000000003b8b00 V __malloc_initialize_hook
00000000000829d0 T malloc
0000000000082db0 W malloc_get_state
00000000000847c0 T malloc_info
0000000000082480 W malloc_set_state
00000000000844f0 W malloc_stats
0000000000084160 W malloc_trim
00000000000844b0 W malloc_usable_size

$ nm -D /lib64/ld-linux-x86-64.so.2 | grep malloc
0000000000016930 W malloc

Questions

  1. malloc is replicated in libc.so and ld-linux.so but in the case of ld-linux.so it is a weak symbol, so they should both resolve to the same address. Additionally, as I understand it, the dynamic loader's symbol resolution table is global and resolves only one address per symbol (correct me if I'm wrong).

    However, gdb clearly shows otherwise (two different addresses). Why is that?

  2. gdb effectively breaks at two different addresses when typing break malloc, but only shows information about the symbol in ld.so when typing info symbol malloc. Why is that?

  3. Although I am breaking at malloc and libc.so defines a malloc symbol of its own (as shown by nm), gdb breaks at the symbol __GI___libc_malloc. Why is that?

fons
  • If the malloc is replicated indeed, then why would it not show up in symbols and not break at? – Michael Pankov Feb 15 '13 at 10:09
  • @Constantius Quoting myself: *malloc is replicated in libc.so and ld-linux.so but in the case of ld-linux.so it is a weak symbol, so they should both resolve to the same address. Additionally, as I understand it, the dynamic loader's symbol resolution table is global and resolves only one address per symbol (correct me if I'm wrong).* – fons Feb 15 '13 at 10:11
  • What I meant is does malloc in ld have a body (the function is defined there)? – Michael Pankov Feb 15 '13 at 10:34
  • @Constantius ok, now I get what you are getting at, thanks. Yes, most probably it does have a body, and regardless of the final symbol resolution gdb displays it. That *partly* answers some of the questions, but not fully and not all :). – fons Feb 15 '13 at 10:38

2 Answers

2
  1. I suspect GDB just puts a breakpoint on every malloc symbol it can find, "just in case" so to speak. GDB uses its internal symbol table, not the dynamic loader's. This way it can break on non-exported symbols, if you have debug symbols. The command feedback probably lists only one address to reduce noise when there are many matches. It still mentions "2 locations", so you can inspect them yourself with info breakpoints.
  2. My guess is that the implementer of info symbol simply did not foresee this situation, so it prints only the first match.
  3. __GI___libc_malloc is the name of the internal, actual implementation of malloc inside libc.so. Since you also get source line info "at malloc.c:2910", I'm guessing it comes from the debug symbols and not from the ELF's symtab. Again, one location can have many names (see __libc_malloc in the symbol list), so GDB just chooses one.
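
Point 3 can be reproduced in miniature. The following is only a sketch, assuming GCC or Clang on an ELF target; impl_alloc and public_alloc are hypothetical names standing in for __libc_malloc and malloc, which nm lists at the same address (00000000000829d0) in the question. It is not how glibc itself declares its aliases.

```c
#include <stdlib.h>

/* Hypothetical "internal" implementation name, standing in for
   glibc's __libc_malloc. It simply forwards to the real malloc. */
void *impl_alloc(size_t n)
{
    return malloc(n);
}

/* A second, "public" name bound to the very same code address.
   The alias attribute is a GCC/Clang extension for ELF targets. */
extern __typeof__(impl_alloc) public_alloc
    __attribute__((alias("impl_alloc")));

/* Returns 1 when both names refer to the same address. */
int names_share_address(void)
{
    return public_alloc == impl_alloc;
}
```

Running nm on the resulting object should list both names with the same value, so a tool that maps an address back to a name has to pick one of them.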

BTW, the pointer to malloc in ld.so's GOT does get replaced by the address of libc's malloc when libc.so is loaded (initially it points to ld.so's internal implementation). So both resolve to the same address by the time the process entry point is reached, and ld.so's malloc is no longer used.
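
One way to check from inside a running process which object ends up providing malloc is to ask the dynamic linker directly. This is a rough sketch, assuming glibc on Linux; object_defining is a hypothetical helper name, and dladdr and RTLD_DEFAULT are GNU extensions that require _GNU_SOURCE. On glibc older than 2.34 the program also has to be linked with -ldl.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for dladdr and RTLD_DEFAULT on glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: resolve `name` through the default global
   lookup scope and return the path of the object that contains the
   resulting address, or NULL if the lookup fails. */
const char *object_defining(const char *name)
{
    void *addr = dlsym(RTLD_DEFAULT, name);
    Dl_info info;

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

Printing object_defining("malloc") from main is expected to name libc.so rather than ld-linux.so on a typical dynamically linked glibc program, though the exact path depends on the system.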

Igor Skochinsky
  • Understood. Thanks for the detailed explanation. I think that what made it confusing for me (and it could also happen to others I guess) is that, in those situations for which gdb picks one of the possible symbols (like "info symbol" and "print"), it doesn't pick the *global* one (i.e. the one in the ELF loader's symbol table). Could this actually be considered a bug? – fons Feb 21 '13 at 20:02
  • @fons probably not a "bug" per se but it probably can be improved. I'm not sure if GDB currently takes into account the symbol visibility at all. I think an email to GDB's mailing list might help clear things up. – Igor Skochinsky Feb 21 '13 at 23:45
-1

ld.so is not a library, it is the dynamic linker (called implicitly to create the runnable image in memory by linking the executable with shared libraries on program start).

vonbrand
  • Yes, I am aware that ld.so is the dynamic linker, but it is also a library that is implicitly loaded into the memory space of any binary using dynamic libraries. BTW, you are not answering any of the 3 questions I asked. – fons Feb 14 '13 at 03:50
  • No, `ld.so` is _not_ a library, it is a statically linked executable; read `ld.so(8)`. In fact, that one is for the (now long obsolete) `a.out` format; today the dynamic linker for ELF is `ld-linux.so`. As for the other questions, I'm not knowledgeable about the intricacies of glibc symbol magic. – vonbrand Feb 14 '13 at 04:02
  • Nope, it is a shared object (just run `readelf -h /lib64/ld-linux-x86-64.so.2 | grep Type`) that also works as an executable. BTW, there is nothing in that manual page (at least not in my distribution) that says otherwise. – fons Feb 14 '13 at 04:07
  • Additionally, whether ld.so (or ld-linux.so for that matter) qualifies as a library or not is beside the point of the 3 questions I asked. – fons Feb 14 '13 at 04:14
  • For further reference, ld.so is a self-contained shared library [which just happens to have a _start symbol](http://www.cs.virginia.edu/~dww4s/articles/ld_linux.html) (that's why it also works as an executable), but it is a shared library after all. – fons Feb 14 '13 at 04:24
  • It is a shared library and is made an executable to aid in debugging. You can see that by trying to run it as a usual executable. – Michael Pankov Feb 15 '13 at 09:25
  • @vonbrand stop merely referring to it and actually _quote_ the document that says otherwise. You can also dig into its sources if you like and discover that it has explicit checks for whether it is run as a regular program or loaded as a shared library by the kernel. – Michael Pankov Feb 15 '13 at 14:05