
I'd like to use libdl to dynamically load arbitrary C++ code. The problem is identifying name-mangled symbols at runtime.

As described here, one solution is to remove name mangling by using extern "C".

http://www.tldp.org/HOWTO/C++-dlopen/theproblem.html

This solution has the drawback of limiting dynamically loaded resources to C style interfaces. Dynamically loaded functions cannot, for instance, be overloaded functions.

What is a good way to overcome this limitation?

One possible solution would be a tool that computes the mangled names for the library's source code, together with a function that looks up a mangled name when the library needs to be linked. Does LLVM provide tools for this?

A clumsier solution would be a function that takes a function signature, generates dummy code containing a function with that signature, pipes it into the same compiler that built the library with the flag for generating assembly, parses the output to retrieve the mangled name, and returns it as a string. That string could then be passed to dlsym().
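
For the simplest signatures the string can even be built by hand, without invoking a compiler. What follows is only a minimal sketch, assuming the Itanium C++ ABI (the mangling scheme GCC and clang use on Linux) and covering nothing beyond non-template free functions in the global namespace whose parameters are a few fundamental types passed by value. The name `mangle_simple` is hypothetical and not part of libdl or any compiler.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration, not a real API).
// Builds the Itanium C++ ABI symbol for a global, non-template free function:
//   _Z <length of name> <name> <one code per parameter type>
// Only a handful of fundamental by-value parameter types are handled.
std::string mangle_simple(const std::string& name,
                          const std::vector<std::string>& params) {
    static const std::map<std::string, char> codes = {
        {"bool", 'b'}, {"char", 'c'}, {"int", 'i'},
        {"long", 'l'}, {"float", 'f'}, {"double", 'd'}};

    std::string out = "_Z" + std::to_string(name.size()) + name;

    // An empty parameter list is encoded as a single 'v' (void).
    if (params.empty()) return out + 'v';

    for (const auto& p : params) {
        auto it = codes.find(p);
        if (it == codes.end())
            throw std::invalid_argument("unsupported type: " + p);
        out += it->second;
    }
    return out;
}
```

For example, `mangle_simple("say", {"int"})` yields `_Z3sayi`, and `mangle_simple("say", {"float"})` yields `_Z3sayf`. Anything involving namespaces, classes, templates, pointers, or references needs the full ABI rules, which is why a compiler-provided answer is preferable.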

To keep the problem concrete, here are two example programs that illustrate something the extern "C" solution can't dynamically load without modifying library code. The first dynamically links a library in traditional C++ fashion. The second uses dlopen. Linking an overloaded function in the first program is simple. There's no simple way to link the overloaded function in the second program.

Program 1: Load-time Dynamic Linking

main.cpp

// forward declarations of functions that will be linked
void say(int);
void say(float);

int main() {
    int myint = 3;
    say(myint);
    float myfloat = 5.0f;
    say(myfloat);
}

say.cpp

#include <iostream>

//extern "C" function signatures would collide

//extern "C" void say(int a) {
void say(int a) {
    std::cout << "The int value is " << a << ".\n";
}

//extern "C" void say(float a) {
void say(float r) {
    std::cout << "The float value is " << r << ".\n";
}

output

$ ./main
The int value is 3.
The float value is 5.

Program 2: Runtime Dynamic Linking

main_with_dl.cpp

#include <iostream>
#include <dlfcn.h>

int main() {
    // open library
    void* handle = dlopen("./say_externC.so", RTLD_LAZY);
    if (!handle) {
        std::cerr << "dlopen error: " << dlerror() << '\n';
        return 1;
    }

    // load symbol
    typedef void (*say_t)(int);

    // clear errors, find symbol, check errors
    dlerror();
    say_t say = (say_t) dlsym(handle, "say");
    const char *dlsym_error = dlerror();
    if (dlsym_error) {
        std::cerr << "dlsym error: " << dlsym_error << '\n';
        dlclose(handle);
        return 1;
    }

    // use function
    int myint = 3;
    say(myint);
    // can't load in void say(float)
    // float myfloat = 5.0f;
    // say(myfloat);

    // close library
    dlclose(handle);
}

output

$ ./main_with_dl
The int value is 3.

Compiling

Makefile

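CXX = g++

all: main main_with_dl say_externC.so

main: main.cpp say.so
	$(CXX) -o $@ $^

# -ldl is needed with glibc older than 2.34 and is harmless on newer versions
main_with_dl: main_with_dl.cpp
	$(CXX) -o $@ $< -ldl

# shared objects should be built as position-independent code
%.so : %.cpp
	$(CXX) -shared -fPIC -o $@ $<

.PHONY: clean
clean:
	rm -f main main_with_dl say.so say_externC.so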
Praxeolitic
  • I was researching how you can pass the decorated name to `dlsym`, but it occurs to me that if you're trying to call an overloaded function _and a conversion will be necessary_ then that's not enough to cut it. Since there's no way (yet) in C++ to determine the signature of an overloaded function that would be called after conversions, the only option I know of is to make wrappers that resolve the overloads in Program2 that can delegate to Program1. Unless... `typeid`? – Mooing Duck Jun 06 '14 at 18:46
  • Losing automatic conversion is not a tragedy and importantly that can be worked around in *client* code. – Praxeolitic Jun 06 '14 at 18:49
  • but it means you must have a wrapper with the exact same signature in the client code, which means overloads can't be added later to the library. Is that a problem for you? – Mooing Duck Jun 06 '14 at 18:50
  • You understand that even with a solution such as you propose, you still couldn't use objects in such an environment, right? – Michael Kohne Jun 06 '14 at 18:50
  • @MichaelKohne You mean that once you get function pointers from dlsym() you still need to know how to call and what to expect as the return value? – Praxeolitic Jun 06 '14 at 18:52
  • @MooingDuck I see what you're saying. That's not a huge problem. If the client code has to have function pointer names that indicate signature that's fine. – Praxeolitic Jun 06 '14 at 18:59
  • Just noticed: "creates dummy code with a function that has the signature, pipes into the compiler that was used with a flag for generating assembly, parses the output to retrieve the mangled name, and returns the mangled name as a string." This is the sort of thing that explains the awesomeness that is Visual Studio lib/dll pairs. – Mooing Duck Jun 06 '14 at 19:02
  • @MooingDuck Wait, I don't see what you're saying. Why couldn't other overloads be added to the library? – Praxeolitic Jun 06 '14 at 19:02
  • @MooingDuck Ha, what is it that Visual Studio lets you do? – Praxeolitic Jun 06 '14 at 19:04
  • @Praxeolitic: because each "wrapper" in the client would have a 1:1 relation with one in the library. If you add one in the library, there's no wrapper in the client that calls it, so the client would never use it. – Mooing Duck Jun 06 '14 at 19:04
  • @Praxeolitic: When Visual Studio generates a dynamic library, it also generates a link library. You link your client code with the link library, and the link library knows the signatures and mangled names and such, and automagically resolves all of this. I didn't even realize that this stuff was complicated until I did research for this question, it just works in Windows. Actually, just realized, none of that applies to dlls that you don't know the names of, like plugins. Never mind. – Mooing Duck Jun 06 '14 at 19:05
  • @MooingDuck Got it. Well if the code dynamically loading the library was itself library code that might be a concern but let's not get too crazy. If something gets added to the library assume a human may or may not choose to use that something in the client code. – Praxeolitic Jun 06 '14 at 19:07
  • @MooingDuck That Visual Studio solution is pretty good. I might try implementing that. – Praxeolitic Jun 06 '14 at 19:10
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/55225/discussion-between-mooing-duck-and-praxeolitic). – Mooing Duck Jun 06 '14 at 19:12
  • @Praxeolitic - Compilers have an enormous amount of leeway in how they lay out objects in memory, and how they construct vtables, where they put the vtables, etc. All of which can change not only from compiler to compiler, but between compiler versions. Add in the various optimizations and structure packing options, and I don't see how you can possibly make this work for objects - there's just no sensible way for one compiler to understand another's object layout. – Michael Kohne Jun 06 '14 at 19:14
  • @MichaelKohne Good point. That is indeed a limitation but virtual base classes that provided known interfaces could go pretty far for interacting with objects that were unknown at link time. – Praxeolitic Jun 06 '14 at 19:25
  • 1. Run `nm` on your dynamic library. 2. Run `c++filt` on the output of `nm`. 3. Voila! You have a table that maps between mangled and demangled form of names. 4. Use it. – n. m. could be an AI Jun 06 '14 at 19:45
  • @Praxeolitic - Sadly, I don't think that even really works. If you're dealing with different compilers, you don't even have a guarantee of what order they build their vtables, so even pure virtual bases are no good. – Michael Kohne Jun 06 '14 at 19:48

1 Answer

Thanks to Mooing Duck I was able to come up with a solution using clang and inspired by Visual Studio.

The key is a macro provided by Visual Studio and clang. The __FUNCDNAME__ macro resolves to the mangled name of the enclosing function. By defining functions with the same signature as the ones we want to dynamically link, we can get __FUNCDNAME__ to resolve to the needed name mangle.

Here's the new version of program 2 that can call both void say(int) and void say(float).

EDIT Mooing Duck dropped more knowledge on me. Here's a version of main_with_dl.cpp that works with say.cpp in the question.

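#include <cstdlib>
#include <iostream>
#include <dlfcn.h>

// library handle shared by all wrappers; set in main() before any wrapper is called
void* handle;

// look up a symbol by its mangled name and cast it to the given function pointer type
template<class func_sig> func_sig get_func(const char* signature)
{
    dlerror(); // clear any stale error
    func_sig func = (func_sig) dlsym(handle, signature);
    const char *dlsym_error = dlerror();
    if (dlsym_error) {
        std::cerr << "dlsym error: " << dlsym_error << '\n';
        dlclose(handle);
        std::exit(1);
    }
    return func;
}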

void say(int a) {
    typedef void(*func_sig)(int);
    static func_sig func = get_func<func_sig>(__FUNCDNAME__);
    return func(a);
}

void say(float a) {
    typedef void(*func_sig)(float);
    static func_sig func = get_func<func_sig>(__FUNCDNAME__);
    return func(a);
}

int main() {
    // open library
    //void* handle = dlopen("./say_externC.so", RTLD_LAZY);
    handle = dlopen("./say.so", RTLD_LAZY);
    if (!handle) {
        std::cerr << "dlopen error: " << dlerror() << '\n';
        return 1;
    }

    // use function
    int myint = 3;
    say(myint);
    float myfloat = 5.0f;
    say(myfloat);

    // close library
    dlclose(handle);
}

http://coliru.stacked-crooked.com/a/7249cc6c82ceab00

The code must be compiled using clang++ with the -fms-extensions flag for __FUNCDNAME__ to work.
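
As a sanity check on the strings being passed to dlsym(), a mangled name can be turned back into a readable signature, much as `c++filt` does on the command line. A minimal sketch, assuming GCC or clang with the Itanium C++ ABI, where `<cxxabi.h>` provides `abi::__cxa_demangle`; the helper name `demangle` is hypothetical:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper (name chosen for illustration).
// Returns the demangled form of an Itanium-ABI symbol name,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result
    // with malloc; the caller must release it with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return mangled;
    std::string result(text);
    std::free(text);
    return result;
}
```

Here `demangle("_Z3sayi")` gives `say(int)` and `demangle("_Z3sayf")` gives `say(float)`. Those should be the two names the wrappers look up when building against the Itanium ABI on Linux, which can be confirmed by printing `__FUNCDNAME__` from inside each wrapper.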

Enlico
Praxeolitic
  • http://coliru.stacked-crooked.com/a/62dffb457ca3eb6a, less code duplication, and fewer calls to `dlerror()` after the symbol has already been resolved once. – Mooing Duck Jun 06 '14 at 20:15
  • The links (both in the comment, @MooingDuck, and in the answer) do not work anymore. – Enlico Sep 11 '22 at 06:48
  • They work for me. I've inlined the code from the answer, for future resilience. – Mooing Duck Sep 12 '22 at 16:30