I was trying out the issue described at https://sourceware.org/ml/libc-alpha/2009-06/msg00168.html

I modified the code as shown below:

>> cat libdep.c
#include <stdio.h>

int duplicate = 'u';

int get_duplicate() {
    printf("libdep sees duplicate as: %c\n", duplicate);
    printf("libdep sees duplicate address as: %x\n", &duplicate);
    return duplicate;
}
--------------------------------------------------------------------------------------

>> cat dynamic.c

#include <stdio.h>

extern int duplicate;
int get_duplicate(void);  /* defined in libdep.c */

int run() {
    duplicate = 'd';
    printf("dynamic sees duplicate from libdep as:  %c\n", duplicate);
    printf("dynamic sees duplicate address as: %x\n", &duplicate);

    printf("but libdep sees duplicate from main as: %c\n", get_duplicate());
    return 0;
}
-------------------------------------------------------------------------------------------------

>> cat main.c

#include <stdio.h>
#include <dlfcn.h>
#include <stdlib.h>

extern int duplicate;

int main() {
    void *h;
    int (*run)();

    duplicate = 'm';

    printf("main sees duplicate as: %c\n", duplicate);
    printf("main sees duplicate address as: %x\n", &duplicate);

    h = dlopen("./dynamic.so", RTLD_LAZY | RTLD_DEEPBIND);
    if (!h)
        abort();

    run = dlsym(h, "run");
    if (!run)
        abort();

    (*run)();
}

Compiling the above files:

gcc -ggdb3 -shared -fPIC libdep.c -o libdep.so

gcc -ggdb3 -shared -fPIC dynamic.c -Wl,-rpath,. -L. -ldep -o dynamic.so

gcc -ggdb3 main.c -Wl,-rpath,. -L. -ldep -ldl

./a.out

main sees duplicate as: m

main sees duplicate address as: 600ba0

dynamic sees duplicate from libdep as: d

dynamic sees duplicate address as: 5f4fb868

libdep sees duplicate as: m

libdep sees duplicate address as: 600ba0

but libdep sees duplicate from main as: m

Note that the same variable has two different addresses. If we remove RTLD_DEEPBIND from main.c, the output is as expected:

main sees duplicate as: m

main sees duplicate address as: 600ba0

dynamic sees duplicate from libdep as: d

dynamic sees duplicate address as: 600ba0

libdep sees duplicate as: d

libdep sees duplicate address as: 600ba0

but libdep sees duplicate from main as: d

So my questions are:

When do we need to use RTLD_DEEPBIND?

How come dynamic.so sees a different address for the duplicate variable when it does not even contain a definition of duplicate?

(I tried with gcc 4.2.2 and gcc 4.8.2.)

Andrew Brēza
Ritesh

1 Answer

You should use RTLD_DEEPBIND when you want to ensure that symbols looked up in the loaded library start within the library and its dependencies, before looking up the symbol in the global namespace.

This lets the library use its own definition of a symbol even when a symbol with the same name is already available in the global namespace from another library; that other definition may be wrong for this library, or cause problems.

One reason to use it is described on the Intel shared math functions page:

When one creates a shared library on Linux* that links in the Intel runtime libraries statically (for example, using the -static-intel option), the expectation is that the application will run the optimized version of functions from the Intel-provided libraries, which are statically linked to the shared library. For example, it is expected that calls to math functions like cos() resolve to libimf, the Intel-provided math library which contains optimized math functions. On Linux, the default behavior is to resolve the symbol to the GNU libm version of these routines.

As to the second question, why we see a different address for duplicate: that is a side effect of building the main application without the -fPIC option. If we run readelf -r on the main application, the output contains this line:

000000600d08  001000000005 R_X86_64_COPY     0000000000600d08 duplicate + 0

Note the _COPY at the end. This means that when the symbol is found in libdep.so, its initial value gets copied into the data segment of the main executable at that address. The reference to duplicate in libdep.so is then looked up, and it resolves to the copy of the symbol that lives in the main executable.

The corresponding relocation in libdep.so looks like:

0000002009b8  000e00000006 R_X86_64_GLOB_DAT 00000000002009f8 duplicate + 0

i.e. GLOB_DAT - global data.
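To make the copy semantics concrete, here is a toy model in plain C. It is only a sketch: `struct toy_image`, `ref_duplicate` and `toy_apply_copy_reloc` are names invented for illustration and do not correspond to real dynamic-linker data structures. It assumes the loader copies the library's initial value into the executable's slot and then points both objects' references at that slot, as described above.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: a hypothetical stand-in for a loaded ELF object. */
struct toy_image {
    int storage;         /* this object's own data slot for `duplicate` */
    int *ref_duplicate;  /* the address this object's code uses for `duplicate` */
};

/* Model of processing the executable's R_X86_64_COPY at startup:
 * copy the library's initial value into the executable's slot, then
 * bind both objects' references to the executable's slot. */
static void toy_apply_copy_reloc(struct toy_image *exe, struct toy_image *lib)
{
    assert(exe != NULL && lib != NULL);
    memcpy(&exe->storage, &lib->storage, sizeof exe->storage);
    exe->ref_duplicate = &exe->storage;
    lib->ref_duplicate = &exe->storage;  /* lib's own slot still exists, but is bypassed */
}
```

If the library's slot starts out holding 'u', then after `toy_apply_copy_reloc` a write of 'm' through the executable's reference is visible through the library's reference, while the library's own slot still holds 'u'. In other words, two storage locations with the same name exist from the moment the program starts.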

When you load dynamic.so, it has its own request for the symbol. Because you use RTLD_DEEPBIND, this symbol is looked up in the dependencies of this library first, before looking in the main executable. As a result, it finds and uses the duplicate exported by libdep.so (the GLOB_DAT one) and not the copy exported by a.out.

This is directly caused by linking to libdep.so as part of the compilation of dynamic.so. If you hadn't linked to it, then you would see the symbol with the other address - i.e. the copy in the main exe.
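The lookup order can be sketched as a small simulation. This is a deliberately simplified model, not how glibc is implemented: `struct toy_object`, `toy_search` and `toy_resolve` are hypothetical names, each object exports at most one symbol, and only direct dependencies are considered. It assumes that a.out exports duplicate (because of the copy relocation) and that the global scope is searched in load order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: a hypothetical stand-in for a loaded object. */
struct toy_object {
    const char *name;
    const char *defines;               /* one exported symbol, or NULL */
    const struct toy_object *deps[4];  /* NULL-terminated dependency list */
};

/* Return the first object in `scope` (n entries) that defines `sym`. */
static const struct toy_object *
toy_search(const struct toy_object *const *scope, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (scope[i]->defines && strcmp(scope[i]->defines, sym) == 0)
            return scope[i];
    return NULL;
}

/* Resolve `sym` on behalf of `requester`. With `deepbind` set, the requester
 * and its direct dependencies are searched before the global scope;
 * otherwise the global scope is searched first. */
static const struct toy_object *
toy_resolve(const struct toy_object *requester, int deepbind,
            const struct toy_object *const *global, size_t nglobal,
            const char *sym)
{
    const struct toy_object *local[5];
    const struct toy_object *hit;
    size_t nlocal = 0;

    assert(requester != NULL);
    local[nlocal++] = requester;
    for (size_t i = 0; i < 4 && requester->deps[i]; i++)
        local[nlocal++] = requester->deps[i];

    if (deepbind) {
        hit = toy_search(local, nlocal, sym);
        return hit ? hit : toy_search(global, nglobal, sym);
    }
    hit = toy_search(global, nglobal, sym);
    return hit ? hit : toy_search(local, nlocal, sym);
}
```

With a global scope of {a.out, libdep.so}, resolving duplicate for dynamic.so (which depends on libdep.so) yields a.out without deep binding and libdep.so with it. If dynamic.so has no dependency on libdep.so, the deep-bound lookup falls through to the global scope and yields a.out again, which is the case described in the last paragraph.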

Marco A.
Anya Shenanigans
  • When you say "You should use RTLD_DEEPBIND when you want to ensure that symbols looked up in the loaded library start within the library, and its dependencies before looking up the symbol in the global namespace", it means that dynamic.so should look into itself and libdep.so, and it should have used the same symbol defined inside libdep.so. But we observe 2 different addresses for duplicate. Also, there is only one definition of duplicate across all the files. – Ritesh Dec 03 '15 at 20:20
  • You're missing the entire second point, which explains the copy semantics: they haul the `duplicate` symbol into the main executable's initial writable data, which is wired to `libdep`'s references at load time of the main exe. There is still a second copy of that data in libdep's data segment, and it is exposed under the same name as the one in the main executable. The real thing to understand is that when the main executable starts, it starts with a second copy of the variable, because that's what came with the `_COPY` relocation for the symbol in the executable. – Anya Shenanigans Dec 03 '15 at 20:35
  • Yes, you are right... Once I read and understood the second point, I accepted the answer immediately :). – Ritesh Dec 03 '15 at 20:40
  • Just got stuck for a few hours with a similar problem: an executable linked with a shared object statically linked with OpenCV that `dlopen`s another shared object statically linked with OpenCV. Without `RTLD_DEEPBIND`, invoking OpenCV code within the dlopened shared object after invoking OpenCV in the other one led my app to do nothing... – david May 02 '22 at 21:11