
I wanted to test what happens if an executable and a shared library it loads each contain their own version of a class, i.e. different classes with the same name. The idea: write a test function that is called once from the executable's own code and once from the library's code:

MWE:

base.h defines an abstract plugin class, which can generate a port object (of type base)

struct base
{
    virtual void accept(struct visitor& ) {}
    virtual void test() = 0;
    virtual ~base() {}
};

struct visitor
{
    virtual void visit(struct base& ) {}
};

struct plugin
{
    virtual ~plugin() {}
    virtual base& port() = 0;
    virtual void test() = 0;
};

typedef plugin* (*loader_t) (void);

plugin.cpp defines a derived plugin class, which can return a derived port (mport)

#include <iostream>
#include "base.h"

struct mport : public base
{
    void accept(struct visitor& ) override {}
    void test() override { std::cout << "plugin:test" << std::endl; }
    virtual ~mport() = default;
};

struct port_failure_plugin : public plugin
{
    void test() override final { inp.test(); }
    virtual ~port_failure_plugin() {}
private:
    mport inp;
    base& port() override { return inp; }
};

extern "C" {
const plugin* get_plugin() { return new port_failure_plugin; }
}

host.cpp defines a derived port class with the same name (mport)

#include <cassert>
#include <cstdlib>
#include <iostream>
#include <dlfcn.h>
#include "base.h"

struct mport : public base
{
#ifdef ACCEPT_EXTERN
    void accept(struct visitor& ) override;
#else
    void accept(struct visitor& ) override {}
#endif
    void test() override { std::cout << "host:test" << std::endl; }
};

#ifdef ACCEPT_EXTERN
void mport::accept(struct visitor& ) {}
#endif

int main(int argc, char** argv)
{
    assert(argc > 1);
    const char* library_name = argv[1];

    loader_t loader;
    void* lib = dlopen(library_name, RTLD_LAZY | RTLD_LOCAL);
    assert(lib);
    *(void **) (&loader) = dlsym(lib, "get_plugin");
    assert(loader);

    plugin* plugin = (*loader)();
    base& host_ref = plugin->port();
    host_ref.test(); // expected output: "host:test"
    plugin->test(); // expected output: "plugin:test"

    return EXIT_SUCCESS;
}

Compile e.g.:

g++ -std=c++11 -DACCEPT_EXTERN -shared -fPIC plugin.cpp -o libplugin.so
g++ -std=c++11 -DACCEPT_EXTERN -rdynamic host.cpp -ldl -o host

The complete code is on github (try make help)

In order to let the host run test "like the plugin does", it calls a virtual function, which is implemented in the plugin. So I expect that test is called

  • once from the object code of the host executable (expectation: "host:test")
  • once from the object code of the plugin library (expectation: "plugin:test")

The reality looks different:

  • In all of the following cases, both outputs are identical (2x "host:test" or 2x "plugin:test").
  • Compiling host.cpp with -rdynamic and without -DACCEPT_EXTERN: both test calls output "plugin:test".
  • Compiling host.cpp with -rdynamic and with -DACCEPT_EXTERN (see Makefile): both test calls output "host:test".
  • Compiling host.cpp without -rdynamic: both test calls output "plugin:test" (for both the intern and the extern build).

Questions:

  1. Is it even possible to call both versions of mport::test (i.e. the executable's and the library's)?
  2. Why does -rdynamic change the behavior?
  3. Why does -DACCEPT_EXTERN affect the behavior?
Johannes

1 Answer


The issue here is that you are violating the One Definition Rule (ODR).

Your two versions of mport::test have the same declaration, yet they do not have the same definition.

But you are doing it at dynamic linking time. The C++ standard does not concern itself with dynamic loading, so we have to turn to the x86 ELF ABI for further detail.

Long story short, the ABI supports a technique known as symbol interposition, which allows substituting symbols dynamically while still seeing consistent behavior. This is what you are doing here, though inadvertently.

You can check it manually:

spectras@etherhop$ objdump -R libplugin.so |grep test
0000000000202cf0 R_X86_64_64       _ZN19port_failure_plugin4testEv@@Base
0000000000202d10 R_X86_64_64       _ZN5mport4testEv@@Base
0000000000203028 R_X86_64_JUMP_SLOT  _ZN5mport4testEv@@Base

Here you see that, in the shared object, all uses of mport::test have a relocation entry. All calls go through the PLT.

spectras@etherhop$ objdump -t host |grep test
0000000000001328  w    F .text  0000000000000037              _ZN5mport4testEv

Here you see that host does export the symbol (because of -rdynamic). So when libplugin.so is dynamically linked, the dynamic linker resolves the plugin's references to mport::test to the definition in your main program.

That's the basic mechanism. That's also why you don't see this without -rdynamic: the host no longer exports its own version, so the plugin uses its own.

How to fix?

You can avoid all this leakage by hiding symbols. This is good practice in general: it makes for faster loading and avoids name clashes.

  • add -fvisibility=hidden when compiling.
  • manually export your get_plugin function by prepending __attribute__ ((visibility("default"))) to the line. Tip: that's compiler-specific stuff, it's a good idea to make it a macro somewhere.

    #define EXPORT __attribute__((visibility("default")))
    // Later:
    EXPORT const plugin* get_plugin() { /* stuff here */ }
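
One way to write such a macro so that it also builds with other compilers is sketched below. The names PLUGIN_EXPORT and plugin_api_version are made up for illustration, and the Windows branch is an assumption about toolchains that this answer does not otherwise cover.

```cpp
// Hypothetical export macro; the name PLUGIN_EXPORT is not from the original code.
#if defined(_WIN32)
#  define PLUGIN_EXPORT __declspec(dllexport)
#elif defined(__GNUC__) || defined(__clang__)
#  define PLUGIN_EXPORT __attribute__((visibility("default")))
#else
#  define PLUGIN_EXPORT
#endif

// Illustrative entry point. With GCC or Clang it stays visible to dlsym even
// when the file is compiled with -fvisibility=hidden, because the attribute
// overrides the command-line default for this one symbol.
extern "C" PLUGIN_EXPORT int plugin_api_version()
{
    return 1;
}
```

Everything that is not marked with the macro keeps the hidden default, so only the entry points you explicitly choose end up in the dynamic symbol table.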
    

Thus:

spectras@etherhop$ g++ -std=c++11 -fvisibility=hidden -shared -fPIC plugin.cpp -o libplugin.so
spectras@etherhop$ g++ -std=c++11 -fvisibility=hidden -rdynamic host.cpp -ldl -o host
spectras@etherhop$ ./host ./libplugin.so
plugin:test
plugin:test

Another option would be to give the class internal linkage by enclosing it in an anonymous namespace. This will work in your simple case, but it is not suitable if your plugin consists of multiple translation units that need to share the class.
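
A minimal sketch of the anonymous-namespace variant, assuming a simplified base whose test() returns a string instead of printing, and a hypothetical helper run_plugin_port (neither is in the original code):

```cpp
#include <string>

// Simplified stand-in for base.h: test() returns its text instead of printing.
struct base
{
    virtual std::string test() = 0;
    virtual ~base() {}
};

namespace {
// Names declared in an unnamed namespace have internal linkage, so this mport
// and its member functions are local to this translation unit and cannot
// clash with an mport defined in another binary.
struct mport : base
{
    std::string test() override { return "plugin:test"; }
};
}

// Hypothetical helper: uses the local mport through a base reference.
std::string run_plugin_port()
{
    mport port;
    base& ref = port;
    return ref.test();
}
```

The limitation mentioned above follows directly: a second translation unit of the same plugin cannot name this mport, because the unnamed namespace is distinct in every translation unit.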

As for your expectation of getting a different result on your two lines: you are getting a base reference to an object of a derived class. Why would you expect anything other than the appropriate virtual override to be called?
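
That dispatch rule can be sketched in a single file. The namespaces host_side and plugin_side are hypothetical stand-ins for the two translation units, and test() returns a string instead of printing so the result is easy to check; neither is in the original code.

```cpp
#include <string>

// Simplified stand-in for base.h: test() returns its text instead of printing.
struct base
{
    virtual std::string test() = 0;
    virtual ~base() {}
};

// Hypothetical namespaces play the role of host.cpp and plugin.cpp, so both
// classes can coexist in one program without violating the ODR.
namespace host_side {
struct mport : base
{
    std::string test() override { return "host:test"; }
};
}

namespace plugin_side {
struct mport : base
{
    std::string test() override { return "plugin:test"; }
};
}

// The caller only sees base&. Which override runs is decided at run time by
// the dynamic type of the object, not by the classes the caller knows about.
std::string call_through_base(base& port)
{
    return port.test();
}

// Create one object of each kind and call it through a base reference.
std::string call_plugin_object()
{
    plugin_side::mport port;
    return call_through_base(port);
}

std::string call_host_object()
{
    host_side::mport port;
    return call_through_base(port);
}
```

Here call_plugin_object() yields "plugin:test" and call_host_object() yields "host:test". The object returned by plugin->port() was created inside the plugin, so it behaves like the first case no matter which code holds the reference.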

spectras
  • Thanks, must read your answer more carefully tomorrow. For your last question, I indeed expect `plugin->test()` to call the plugin `test` function. However, I expected `host_ref.test()` to call the host.cpp function: When compiling `host`, the compiler only knows one virtual override: the one in host.cpp, so that's the only one it could call? – Johannes Aug 16 '18 at 20:52
  • It just doesn't know. Your program could do crazy things such as, say, loading other derived classes from dynamic libraries ;). So it cannot optimize away the virtual call resolution, and properly goes through the vtable. – spectras Aug 16 '18 at 20:55
  • Oh, I see, using `host_ref.test()` need not lead to the output "host:test", as this depends on runtime decisions. Do you have any idea how I can still call the function that outputs "host:test"? – Johannes Aug 16 '18 at 21:04
  • On the object returned by the plugin? No way without undefined behavior, as with the fix I suggest, the two `mport` classes are independent, incompatible types. But `host.cpp` is free to instantiate its own `mport` class. That one will now print `host:test` as expected. – spectras Aug 16 '18 at 21:08
  • Let's see if I got it. Violating the ODR is undefined behavior in the C++ standard, but the ABI defines it as an "LD_PRELOAD"-like override. `-fvisibility=hidden` prevents that "LD_PRELOAD"-like override (or the unwanted behavior from `-rdynamic`). – Johannes Aug 17 '18 at 18:19
  • Yes. Hidden visibility does exactly what it says: symbols are invisible to other shared objects. It's exactly like they had a different name. – spectras Aug 18 '18 at 13:35
  • `-fvisibility=hidden` seems to have side effects. I just added a `dynamic_cast<mport*>(&plugin->port())` at the end of host.cpp. The cast succeeded, and the executable now treats the object as an `mport` as defined in host.cpp. This caused invalid writes when I added different member variables to the two `mport` classes :-( There's probably no way to find out when the `dynamic_cast` goes wrong? – Johannes Aug 19 '18 at 15:53
  • No dynamic cast across plugin boundaries. That usually extends to no exceptions as well. – spectras Aug 19 '18 at 18:56
  • To expand a bit on that comment: dynamic_cast and exceptions require a typeinfo lookup, which requires RTTI information to be exported and visible. So to make your example work, you *must* export your `mport` class. And as soon as it's globally visible, then by definition it is one class, and the linker will pick one of the implementations and declare it to be *the one and only*. Now as to why it has this odd behavior… check [this question](https://stackoverflow.com/questions/19496643/using-clang-fvisibility-hidden-and-typeinfo-and-type-erasure). – spectras Aug 19 '18 at 19:13
  • Sorry, one last question. There is a difference in compiling with `ACCEPT_EXTERN` or without (`make extern` or `make intern`), see the 2nd and 3rd point of the 4 points above (it happens if visibility is not used, or if it's used and both versions of `mport::test` are exported). Is the difference just happening for "random" reasons, or is this defined behaviour? It's hard to believe that moving a function out of the class body would make any difference. – Johannes Aug 21 '18 at 21:22
  • Actually it does: defining a method within the class definition implicitly makes it `inline`. When the class has only inline methods, none of which are used from that translation unit, your compiler chooses not to emit the class at all, on the rationale that any other translation unit that uses it will come with its own copy of the definition. – spectras Aug 21 '18 at 22:29