relocation entries in a shared lib

Question

I'm investigating relocation of shared libraries, and ran into something strange. Consider this code:

int myglob;

int ml_util_func(int p)
{
    return p + 2;
}

int ml_func2(int a, int b)
{
    int c = ml_util_func(a);
    return c + b + myglob;
}

I compile it to a non-PIC shared lib with gcc -shared. I do this on a 32-bit Ubuntu running on x86.

The resulting .so has a relocation entry for the call to ml_util_func in ml_func2. Here's the output of objdump -dR -Mintel on ml_func2:

0000050d <ml_func2>:
 50d:   55                      push   ebp
 50e:   89 e5                   mov    ebp,esp
 510:   83 ec 14                sub    esp,0x14
 513:   8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
 516:   89 04 24                mov    DWORD PTR [esp],eax
 519:   e8 fc ff ff ff          call   51a <ml_func2+0xd>
                        51a: R_386_PC32 ml_util_func
 51e:   89 45 fc                mov    DWORD PTR [ebp-0x4],eax
 521:   8b 45 0c                mov    eax,DWORD PTR [ebp+0xc]
 524:   8b 55 fc                mov    edx,DWORD PTR [ebp-0x4]
 527:   01 c2                   add    edx,eax
 529:   a1 00 00 00 00          mov    eax,ds:0x0
                        52a: R_386_32   myglob
 52e:   8d 04 02                lea    eax,[edx+eax*1]
 531:   c9                      leave  
 532:   c3                      ret    
 533:   90                      nop

Note the R_386_PC32 relocation on the call instruction.

Now, my question is: why is this relocation needed? e8 is "call relative..." on x86, and since ml_util_func is defined in the same object, surely the linker can compute the relative offset between it and the call without leaving it to the dynamic loader?

Interestingly, if ml_util_func is declared static, the relocation disappears and the linker correctly computes and inserts the offset. What is it about ml_util_func being also exported that makes the linker lazy about it?
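
For comparison, here is a minimal sketch of the static variant described above. The only change from the original code is the static keyword on ml_util_func; the build command is assumed to be the same gcc -shared invocation.

```c
int myglob;

/* Internal linkage: the symbol is not exported, so nothing outside
   this file can override it, and the call offset in ml_func2 can be
   fixed at link time. */
static int ml_util_func(int p)
{
    return p + 2;
}

int ml_func2(int a, int b)
{
    int c = ml_util_func(a);
    return c + b + myglob;
}
```

Running objdump -dR on a library built from this version should show the call without an R_386_PC32 entry, while the R_386_32 entry for myglob remains.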

P.S.: I'm playing with non-PIC code on purpose, to understand load-time relocations.

@osgx: because, as my question states in its very first sentence - I'm interested specifically in load-time relocations. I've now added a P.S. to clarify — Eli Bendersky, Aug 20 '11 at 11:38
Can you override a global symbol in one library with another library (LD_PRELOAD)? — osgx, Aug 20 '11 at 11:43
Eli, there is also [`__attribute__((visibility("type")))`](http://gcc.gnu.org/wiki/Visibility), where [type is one of hidden, internal, protected, default](http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html#Function-Attributes), for fine-grained control over how any symbol is bound. — osgx, Mar 06 '12 at 14:01

osgx · Accepted Answer · 2011-08-20T13:25:50.150

I can't find the exact reason, but here is a comment from the binutils source about this:

binutils-2.11.90-20010705-src.tar.gz/bfd/elf32-i386.c : 679

      /* If we are creating a shared library, and this is a reloc
         against a global symbol, or a non PC relative reloc
         against a local symbol, then we need to copy the reloc
         into the shared library.  However, if we are linking with
         -Bsymbolic, we do not need to copy a reloc against a
         global symbol which is defined in an object we are

I think this relocation is created to allow the user to override any global symbol in the library. It also seems that -Bsymbolic disables this ability, so no relocation is generated for a symbol defined in the library itself.

http://www.rocketaware.com/man/man1/ld.1.htm

-Bsymbolic This option causes all symbolic references in the output to be resolved in this link-edit session. The only remaining run-time relocation requirements are base-relative relocations, i.e. translation with respect to the load address. Failure to resolve any symbolic reference causes an error to be reported.
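
A per-symbol alternative to -Bsymbolic is GCC's visibility attribute, mentioned in the comments on the question. The following is only a sketch, assuming GCC (or a compatible compiler) and the same code as in the question. A hidden symbol cannot be overridden from outside the library, so the linker should be able to bind the call locally, much as it does for a static function.

```c
int myglob;

/* Hidden visibility (GCC extension): the function can still be called
   from other source files of the same library, but it is not exported
   from the .so, so callers inside the library bind to this definition
   at link time. */
__attribute__((visibility("hidden")))
int ml_util_func(int p)
{
    return p + 2;
}

int ml_func2(int a, int b)
{
    int c = ml_util_func(a);
    return c + b + myglob;
}
```

The trade-off is that ml_util_func is no longer callable from outside the library. If it has to stay exported, -Bsymbolic is the option that keeps it visible while binding internal references locally.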

A longer description of the various -B modes and their limitations (for C++) is here:

http://developers.sun.com/sunstudio/documentation/ss12/mr/man1/CC.1.html

-Bbinding

           Specifies whether a library binding for linking is
           symbolic, dynamic (shared), or static (nonshared).

           -Bdynamic is the default.  You can use the -B
           option several times on a command line.

           For more information on the -Bbinding option, see
           the ld(1) man page and the Solaris documentation.


           -Bdynamic directs the link editor to look for
           liblib.so files. Use this option if you want
           shared library bindings for linking.  If the
           liblib.so files are not found, it looks for
           liblib.a files.

           -Bstatic directs the link editor to look only for
           liblib.a files. The .a suffix indicates that the
           file is static, that is, nonshared.  Use this
           option if you want nonshared library bindings for
           linking.

           -Bsymbolic forces symbols to be resolved within a
           shared library if possible, even when a symbol is
           already defined elsewhere. For an explanation of
           -Bsymbolic, see the ld(1) man page.

           This option and its arguments are passed to the
           linker, ld.  If you compile and link in separate
           steps and are using the -Bbinding option, you must
           include the option in the link step.

           Warning:

           Never use -Bsymbolic with programs containing C++
           code, use linker scoping instead. See the C++
           User's Guide for more information on linker
           scoping. See also the -xldscope option.

           With -Bsymbolic, references in different modules
           can bind to different copies of what is supposed
           to be one global object.

           The exception mechanism relies on comparing
           addresses. If you have two copies of something,
           their addresses won't compare equal, and the
           exception mechanism can fail because the exception
           mechanism relies on comparing what are supposed to
           be unique addresses.

Interesting, there's also this page: http://www.technovelty.org/code/c/bsymbolic.html that explains Bsymbolic. I'll study this, thanks for the pointer — Eli Bendersky, Aug 20 '11 at 12:01
I think Bsymbolic is just a related flag, the real deal here is the search order of symbols - first in the executable, then shared libraries, and the ability to override a global symbol using this search order. LD_PRELOAD is another way. — Eli Bendersky, Aug 21 '11 at 04:05

score 0 · Answer 2 · answered Oct 01 '11 at 11:46

Note that an object file is not necessarily linked in as a single block. Symbols can be placed in separate sections, and each section is kept in the final binary only if something references it (look up the --gc-sections linker option and the related GCC options -ffunction-sections and -fdata-sections).

The linker may simply not bother with this micro-optimization when separate sections are not used.
