The Symbol Relocation

Question

The following is how a function call (the first time it is made) is resolved in position-independent code (PIC):

Jump to the PLT entry of our symbol.
Jump to the GOT entry of our symbol.
Jump back to the PLT entry and push an offset on the stack. That offset actually identifies an Elf_Rel structure describing how to patch the symbol.
Jump to the PLT stub entry.
Push a pointer to a link_map structure so that the linker can find which library the symbol belongs to.
Call resolver routine.
Patch the GOT entry.
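The steps above can be mimicked in plain C. This is only a toy model, not the real dynamic linker: `got_foo`, `foo_plt`, `foo_lazy_stub`, `resolve` and `real_foo` are hypothetical names, and a function pointer stands in for the GOT slot. It assumes nothing about any particular ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of lazy binding. All names here are hypothetical;
 * this is not how ld.so is actually implemented. */

typedef int (*func_t)(void);

/* Stands in for the real foo() living in a shared library. */
static int real_foo(void) { return 42; }

/* Stands in for the second half of foo's PLT entry (push + jump to resolver). */
static int foo_lazy_stub(void);

/* "GOT" slot for foo: before the first call it points back into the PLT. */
func_t got_foo = foo_lazy_stub;

/* Counts how often the (toy) resolver ran. */
int resolver_calls = 0;

/* Stands in for the resolver routine: find the symbol, patch the GOT slot. */
static func_t resolve(func_t *got_slot) {
    assert(got_slot != NULL);
    resolver_calls++;
    *got_slot = real_foo;      /* patch the GOT entry */
    return *got_slot;
}

static int foo_lazy_stub(void) {
    func_t target = resolve(&got_foo);
    return target();           /* finish the original call */
}

/* Stands in for the first instruction of the PLT entry:
 * an indirect jump through the GOT slot. */
int foo_plt(void) {
    return got_foo();
}
```

In this sketch the first `foo_plt()` call goes through the stub and the resolver; every later call jumps straight to `real_foo` through the patched slot, so `resolver_calls` stays at 1.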

This is different from how a data reference is made, which just uses the GOT.

So, why is there this difference? Why 2 different approaches?

Please don't group unrelated questions. It doesn't cost you extra to ask two separate ones. — Employed Russian, May 29 '18 at 04:36
I recently asked a question (related to this topic only) and got no response. Could you please take a look at it? — ray an, May 29 '18 at 08:42
https://stackoverflow.com/questions/50453228/symbol-resolution-and-dynamic-linking — ray an, May 29 '18 at 08:42
As far as the second question is concerned, I will be making a new post for it as you suggested. Thanks — ray an, May 29 '18 at 08:45

score 2 · Accepted Answer · answered May 29 '18 at 04:26

why is there this difference? Why 2 different approaches?

What you described is lazy relocation.

You don't have to use it, and will not use it if e.g. LD_BIND_NOW=1 is set in the environment.

It's an optimization: it reduces the amount of work that the dynamic linker has to perform when a particular program invocation does not exercise many of the possible execution paths.

Imagine a program that can call foo(), bar() or baz(), depending on arguments, and which calls exactly one of the routines in any given execution.

If you didn't use lazy relocation, the dynamic loader would have to resolve all 3 routines at program startup. Lazy relocation allows the dynamic loader to perform only the one relocation that is actually required in any given execution (the one function that is getting called), and at exactly the right time (when the function is being called).
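To put numbers on the foo()/bar()/baz() example, here is a rough sketch that just counts symbol lookups under each strategy. `toy_linker`, `run_eager` and `run_lazy` are made-up names for illustration; real lookups are of course far more involved than bumping a counter.

```c
#include <assert.h>

enum { N_FUNCS = 3 };           /* slots for foo, bar, baz */

/* Hypothetical stand-in for the dynamic loader's bookkeeping. */
typedef struct {
    int resolved[N_FUNCS];      /* 1 once a slot has been relocated */
    int work;                   /* number of symbol lookups performed */
} toy_linker;

/* Resolve one slot if it has not been resolved yet. */
static void resolve_one(toy_linker *l, int idx) {
    assert(idx >= 0 && idx < N_FUNCS);
    if (!l->resolved[idx]) {
        l->resolved[idx] = 1;
        l->work++;
    }
}

/* Eager binding (as with LD_BIND_NOW=1): resolve everything at startup. */
int run_eager(int which) {
    toy_linker l = {{0}, 0};
    for (int i = 0; i < N_FUNCS; i++)
        resolve_one(&l, i);
    resolve_one(&l, which);     /* the actual call: already resolved, no extra work */
    return l.work;
}

/* Lazy binding: resolve only the function that is actually called. */
int run_lazy(int which) {
    toy_linker l = {{0}, 0};
    resolve_one(&l, which);     /* resolved on first call */
    return l.work;
}
```

For a run that calls only bar() (index 1), `run_eager(1)` does 3 lookups and `run_lazy(1)` does 1.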

Now, why can't variables also be resolved that way?

Because there is no convenient way for the dynamic loader to know when to perform that relocation.

Suppose the globals are a, b and c, and that foo() references a and b, bar() references b and c, and baz() references a and c. In theory the dynamic loader could scan the bodies of foo, bar and baz, and build a map of "if calling foo, then also resolve globals a and b", etc. But it's much simpler and faster to just resolve all references to globals at startup.
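A small sketch of why a data access gives the loader no hook: the access is just a load through the GOT slot, with no stub code that could call a resolver first. `a_storage`, `got_a`, `relocate_data_at_startup` and `read_a` are hypothetical names, and a plain pointer stands in for the GOT slot.

```c
#include <stddef.h>

/* Stands in for the global `a` defined in some shared library. */
int a_storage = 7;

/* "GOT" slot for `a`: unresolved until the loader fills it in. */
int *got_a = NULL;

/* What the loader does for data relocations: fill the slot at startup. */
void relocate_data_at_startup(void) {
    got_a = &a_storage;
}

/* What compiled code does to read `a`: a plain load through the slot.
 * There is no call here, so nothing can intercept the first access;
 * if the slot were still unresolved, this would dereference NULL. */
int read_a(void) {
    return *got_a;
}
```

Once `relocate_data_at_startup()` has run, `read_a()` returns 7. Contrast this with a function call, where the indirect jump itself can land in resolver code.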
