
On macOS Ventura, calling dlopen(NULL, 0) returns a handle through which the entire executable's symbol table can be searched. Using this handle, one can obtain pointers to symbol data and read or modify their contents, and the changes are visible throughout the program. However, attempting this with function pointers does not work the same way.

For example, the following code:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int my_var = 0;

int main() {
    void *handle = dlopen(NULL, 0);
    int *a = dlsym(handle, "my_var");
    *a = 5;
    printf("%d", my_var);
    return 0;
}

will print 5 instead of 0. However, when attempting something similar with function pointers:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*ftype)(void);

void fun1() {
    printf("a");
}

void fun2() {
    printf("b");
}

int main() {
    void *handle = dlopen(NULL, 0);
    ftype f1ptr;
    *(void **)(&f1ptr) = dlsym(handle, "fun1");
    f1ptr = fun2;
    fun1();
    return 0;
}

will print a instead of b. Is there any way to perform an operation similar to the first code segment with function pointers? Essentially, I would like fun1 to now point to the code that fun2 points to. Changing f1ptr's type to "ftype *" and then performing "*f1ptr = fun2" causes a bus fault.

jungon
  • Maybe call the function via the pointer? f1ptr(); (you are calling fun1 directly) You can't change the executable code once it's been loaded (because virus writers would love it if you could). – robthebloke Jan 04 '23 at 06:11
  • i see why you cannot change the executable code, but i would really just like the symbol "fun1" to refer to the same symbol table contents as the symbol "fun2" – jungon Jan 04 '23 at 06:20
  • The executable code is in the same block of memory as the symbol table. It is read-only. Exported variables are handled differently. Typically behind the scenes, most OS's will allocate some memory, and in the case above, splat the whole lot with a memset(0). The symbol pointer you request for that var will point into that allocation. That is why you can modify the variable. In theory you could mmap the base address with PROT_WRITE, and manually step to the function table. That might work on a microcontroller, but will probably throw SIGSEGV on macOS if you attempt to write there. – robthebloke Jan 04 '23 at 06:49
  • That makes sense. Is there any way to dereference f1ptr and set that value to fun2? – jungon Jan 04 '23 at 06:53
  • The type of `f1ptr` _dereferenced_, i. e. a dereferenced pointer to a function, is a function; a function has no value. – Armali Jan 04 '23 at 11:42
  • Your examples don't do the same thing. In your first program you change the _value_ of `my_var`, not the address of it. Since the code of `fun1` is read-only, you cannot change it. -- So, what do you really want to achieve? Please note that the call `fun1()` does _not use_ the entry in the symbol table to find the entry point. – the busybee Jan 04 '23 at 12:43
  • OT: Your second example is erroneous: `ftype` is a variable, not a data type. You need to write `typedef` in front of it to correct. – the busybee Jan 04 '23 at 16:09
  • @thebusybee " Please note that the call fun1() does not use the entry in the symbol table to find the entry point." I did not know about this. If it does not use the symbol table, then how does the program locate the entry point and why does it record the symbol fun1 in the symbol table? – jungon Jan 04 '23 at 17:38
  • The symbol table exists only if debug information is stored in the file. (Exceptions prove the rule.) A stripped executable file has no symbol table. -- Since functions are static "objects" (functions are no first class objects in C), the linker as the last building phase resolves such references. You might want to do some research on the matter, a comment is the wrong place to tell you more, other than that this is a complete other issue as your question. – the busybee Jan 04 '23 at 19:03

1 Answer

However, attempting this with functions pointers does not work the same way.

An identifier names an object or a function. The identifier for a function is not a function pointer, and your example does not show the code working differently for objects or functions.

In int *a = dlsym(handle, "my_var");, your code obtains a pointer to my_var. Then it uses *a = 5; to change the value of my_var.

In *(void **)(&f1ptr) = dlsym(handle, "fun1");, your code obtains a pointer to fun1 (although it mishandles it, discussed below). So f1ptr is a pointer to fun1. It is a pointer to the function, not a pointer to a function pointer. If functions were somehow modifiable, then *f1ptr = …; would modify the function.

However, you do not use *f1ptr = …;. You use f1ptr = fun2;. This simply changes f1ptr. It does not change what f1ptr points to. It does not change fun1.
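The same distinction can be seen with ordinary objects, where both forms are legal. The following is only a minimal sketch: my_var comes from the question, while other and the two helper functions are hypothetical names introduced for illustration.

```c
int my_var = 0;
int other = 7;   /* hypothetical second object, for illustration only */

/* Assigning to the pointer itself: only `a` changes, my_var is untouched. */
int assign_pointer_only(void) {
    int *a = &my_var;
    a = &other;          /* analogous to f1ptr = fun2; */
    (void)a;             /* silence "set but not used" warnings */
    return my_var;
}

/* Assigning through the pointer: the object it points to changes. */
int assign_through_pointer(void) {
    int *a = &my_var;
    *a = 5;              /* analogous to *a = 5; in the first program */
    return my_var;
}
```

Calling assign_pointer_only() leaves my_var at its previous value, whereas assign_through_pointer() sets it to 5. For a function there is no legal counterpart to the second form, because a function is not a modifiable object.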

fun1 is an identifier for the function. It is not a pointer. So there is no pointer to change. Nothing can be done in the program to make fun1 be a different function or point to a different function.

Regarding why *(void **)(&f1ptr) = dlsym(handle, "fun1"); is wrong, this says to take the address of f1ptr, convert the address to a void **, and to use that memory location as if it were a void *. Then that memory location is assigned the value of the void * returned by dlsym. In other words, you are accessing the pointer-to-function object f1ptr using the type pointer to void. That violates C 2018 6.5 7, which says that an object shall be accessed only with its defined type or certain other types, none of which allows accessing a pointer to a function as a void *.

The C standard allows a pointer to a function and a pointer to void to have different representations in memory and even different sizes, in which case this assignment would fail “mechanically”; the bytes written into f1ptr would not be suitable for use as a pointer-to-function. But even in C implementations where all pointers have the same representation, this assignment can fall afoul of optimization by the compiler. It should never be used.

The proper way to assign the result of dlsym to f1ptr is to convert the result to the necessary type:

f1ptr = (ftype) dlsym(handle, "fun1");

(This still will not let you change fun1; it just shows you how to use dlsym correctly.)
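If the goal is for calls to be redirectable at run time, one possible approach (not something the original programs do) is to make the thing being called a pointer object rather than a function. Such an object can be modified just like my_var. The sketch below assumes hypothetical names fun1_impl, fun2_impl and retarget, and has the functions return a character instead of printing so the effect is easy to check.

```c
typedef char (*ftype)(void);

/* Hypothetical implementations; the names are for illustration only. */
char fun1_impl(void) { return 'a'; }
char fun2_impl(void) { return 'b'; }

/* An ordinary, writable object that holds the current target.
   Callers write fun1() and the call goes through this pointer. */
ftype fun1 = fun1_impl;

/* Same shape as `*a = 5;`: store through a pointer to the object. */
void retarget(ftype *slot, ftype replacement) {
    *slot = replacement;
}
```

After retarget(&fun1, fun2_impl), the expression fun1() calls fun2_impl. Because fun1 is now a data symbol, its address is the kind of thing the first program obtained for my_var, so a pointer to it has type ftype * and may legitimately be written through.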

Eric Postpischil