4

I have a primitive understanding of how the linker does dead-code elimination of unused functions and data segments. If you use the proper compiler and linker flags, the compiler puts each function and data member into its own section. When the linker then links them, it sees that nothing references a given section and leaves that section out of the final ELF.

I'm trying to reconcile how that works with function pointers. You could, for example, have a function pointer whose value is based on user input. Probably not a safe thing to do, but how would the compiler and linker handle that?

NickHalden
  • 2
    That is not garbage collection, it is usually called `dead code elimination` http://en.wikipedia.org/wiki/Dead_code_elimination . – siritinga Oct 28 '14 at 20:39
  • Read also the wiki page on [garbage collection](http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29). Actually, GCC has an *internal* GC (quite poor IMNSHO; [MELT](http://gcc-melt.org/) has a better one, and runs inside `gcc`), called `ggc`, to deal with internal data at compile time. There is no direct relation to your question. – Basile Starynkevitch Oct 28 '14 at 20:41
  • Ah, sorry about the terminology confusion. The linker flag you pass to gcc is "--gc-sections" so that's a little misleading. I updated the question. The scenario posed in my question still stands though. – NickHalden Oct 28 '14 at 20:43

1 Answer

4

There is no portable way to assign a function pointer without making an explicit reference to the function (for example you cannot use pointer arithmetic on function pointers).

So every function that is reachable from your program must also be named and referenced somewhere in the code, and the linker will know about it. Even just storing the function pointer in an array, as in:

    void foo(void), bar(void), baz(void);  /* defined elsewhere */
    typedef void (*Callback)(void);
    Callback callbacks[] = { foo, bar, baz };

is enough to ensure that the listed functions are included in the linked executable (the array contents are filled in at link time or at load time, depending on the platform).
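To connect this with the "function pointer chosen by user input" case from the question, here is a minimal sketch. The handler names, their return values, and the `dispatch` helper are hypothetical and only for illustration. It assumes a GCC/binutils toolchain building with `-ffunction-sections -fdata-sections` and linking with `-Wl,--gc-sections`. The input only selects among functions the table already names, so the linker sees a reference to each of them; `never_used` is named nowhere and is the kind of function that can be discarded.

```c
#include <stddef.h>

typedef int (*Callback)(void);

/* Hypothetical handlers with external linkage, each in its own section
   when compiled with -ffunction-sections. */
int foo(void) { return 1; }
int bar(void) { return 2; }
int baz(void) { return 3; }

/* Never referenced anywhere: a candidate for removal by --gc-sections. */
int never_used(void) { return 4; }

/* Naming the functions here creates references to their sections,
   so the linker keeps foo, bar and baz. */
static const Callback callbacks[] = { foo, bar, baz };

/* Pick a handler from a run-time value (for example, user input).
   Returns -1 when the choice is out of range. */
int dispatch(size_t choice)
{
    size_t count = sizeof callbacks / sizeof callbacks[0];

    if (choice >= count)
        return -1;
    return callbacks[choice]();
}
```

The range check is the important design point: any value that does not map to an entry in the table is rejected rather than turned into an address, so there is no path from the input to a function the linker did not see.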

6502
  • Interesting, did not know that. Can you link me to some resource explaining why you cannot assign a function pointer with math or any other means other than explicit reference to function? Also, what about data segments? I know you can make a data pointer using pointer arithmetic... – NickHalden Oct 28 '14 at 20:45
  • On most POSIX operating systems (e.g. Linux) you could load a plugin at runtime using `dlopen` and fetch some function symbol in it with `dlsym`; such functions are not part of the *initial* text of the program. – Basile Starynkevitch Oct 28 '14 at 20:51
  • 2
    @NickHalden Ask yourself how you would go about defining a function pointer using arithmetic. Pointer arithmetic for data pointers relies on the compiler knowing the size of the data type. There is no way of knowing the size of the compiled functions, or their order in memory. It just turns into a logistical impossibility. – Degustaf Oct 28 '14 at 20:56
  • 1
    This answer is incorrect. It is in fact allowed to assign arbitrary values to function pointers or to cast integers to function pointers, but calling the resulting function pointer might be undefined behavior. – fuz Oct 28 '14 at 22:25
  • @FUZxxl: It's correct in the context of the question. Of course there are other ways to get a valid function pointer, as the comment on dlopen/dlsym explains. However, code removal at link time is not a problem for function pointers because there's no portable way to get a pointer to a function that hasn't been referenced explicitly in the source code. Note also that you can get UB even for just setting a pointer to an invalid value (no need to call the function); there is hardware on which just setting a bad value in a pointer may trigger a hardware trap. – 6502 Oct 28 '14 at 22:36
  • If that's what you tried to convey, please consider rewording your answer. It is quite misleading the way it is formulated right now. – fuz Oct 29 '14 at 07:30
  • @FUZxxl: I added "portable" to the wording even though I think the answer was clear enough anyway (but of course understanding depends on the reader, so maybe you are right). It also seems to me that we don't agree on the meaning of "allowed"... it seems that by your definition calling `printf` with wrong parameter types, dereferencing a null pointer or writing outside the bounds of an array is indeed allowed. – 6502 Oct 29 '14 at 08:44
  • So POSIX isn't portable for you? It specifies [dlsym](http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html) to get the address of a symbol, which could be a function. – fuz Oct 29 '14 at 09:06
  • Before we get lost: if a function is marked as hidden, it will not be found by `dlsym`. On Windows, it must even be explicitly marked with `__declspec(dllexport)`. Another interesting case is static libraries: their functions are thrown out if not used, unless the `--whole-archive` linker option is present. – keltar Oct 29 '14 at 09:59
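The `dlopen`/`dlsym` route discussed in the comments can be sketched roughly as follows. This is only an illustration under assumptions: a POSIX system with a dynamically linked executable (older glibc versions also need `-ldl` at link time), and a hypothetical helper name `find_puts`. It looks up `puts` in the already loaded C library by name, so the function is found through the dynamic symbol table rather than through a reference the static linker resolved.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef int (*puts_fn)(const char *);

/* Look up puts by name at run time.
   Returns NULL if the lookup fails. */
puts_fn find_puts(void)
{
    /* A NULL path asks for a handle covering the main program and the
       shared objects loaded with it. The handle is deliberately not
       closed, since the returned pointer refers into a loaded object. */
    void *handle = dlopen(NULL, RTLD_NOW);

    if (handle == NULL)
        return NULL;

    /* POSIX allows converting the void * from dlsym to a function
       pointer; ISO C alone does not define this conversion. */
    return (puts_fn)dlsym(handle, "puts");
}
```

Only exported dynamic symbols are visible this way, which matches the point about hidden functions above: a function in the main program that is not exported would not be found by name, whether or not the linker kept it.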