
Suppose I have a static variable in a compilation unit which ends up in a static library libA. Another compilation unit accesses this variable and ends up in a shared library libB.so (so libA must be linked into libB). Finally, I have a main function that also accesses the static variable from A directly and depends on libB (so I link against both libA and libB).

I then observe that the static variable is initialized twice, i.e. its constructor is run twice! This doesn't seem right. Shouldn't the linker recognize both variables as the same and merge them into one?

To make my confusion perfect, I see it is run twice with the same address! So maybe the linker did recognize it, but did not remove the second call in the static_initialization_and_destruction code?

Here's a showcase:

ClassA.hpp:

#ifndef CLASSA_HPP
#define CLASSA_HPP

class ClassA
{
public:
    ClassA();
    ~ClassA();
    static ClassA staticA;

    void test();
};

#endif // CLASSA_HPP

ClassA.cpp:

#include <cstdio>
#include "ClassA.hpp"

ClassA ClassA::staticA;

ClassA::ClassA()
{
    printf("ClassA::ClassA() this=%p\n", this);
}

ClassA::~ClassA()
{
    printf("ClassA::~ClassA() this=%p\n", this);
}

void ClassA::test()
{
    printf("ClassA::test() this=%p\n", this);
}

ClassB.hpp:

#ifndef CLASSB_HPP
#define CLASSB_HPP

class ClassB
{
public:
    ClassB();
    ~ClassB();

    void test();
};

#endif // CLASSB_HPP

ClassB.cpp:

#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"

ClassB::ClassB()
{
    printf("ClassB::ClassB() this=%p\n", this);
}

ClassB::~ClassB()
{
    printf("ClassB::~ClassB() this=%p\n", this);
}

void ClassB::test()
{
    printf("ClassB::test() this=%p\n", this);
    printf("ClassB::test: call staticA.test()\n");
    ClassA::staticA.test();
}

Test.cpp:

#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"

int main(int argc, char * argv[])
{
    printf("main()\n");
    ClassA::staticA.test();
    ClassB b;
    b.test();
    printf("main: END\n");

    return 0;
}

I then compile and link as follows:

g++ -c ClassA.cpp
ar rvs libA.a ClassA.o
g++ -c ClassB.cpp
g++ -shared -o libB.so ClassB.o libA.a
g++ -c Test.cpp
g++ -o test Test.cpp libA.a libB.so

Output is:

ClassA::ClassA() this=0x804a040
ClassA::ClassA() this=0x804a040
main()
ClassA::test() this=0x804a040
ClassB::ClassB() this=0xbfcb064f
ClassB::test() this=0xbfcb064f
ClassB::test: call staticA.test()
ClassA::test() this=0x804a040
main: END
ClassB::~ClassB() this=0xbfcb064f
ClassA::~ClassA() this=0x804a040
ClassA::~ClassA() this=0x804a040

Can somebody please explain what is going on here? What is the linker doing? How can the same variable be initialized twice?

bselu
  • Related (maybe duplicate): http://stackoverflow.com/questions/6714046/c-linux-double-destruction-of-static-variable-linking-symbols-overlap – jogojapan Oct 24 '14 at 12:15
  • Could it be related to compiling a static lib and then using that to compile the shared lib? So that there is init code from both ClassA.o and ClassB.o in the libB.so? – heksesang Oct 24 '14 at 13:30
  • @heksesang: Yes, it happens only in this constellation. If I make both `A` and `B` static libs or both shared libs, I do not face the issue (c'tor of `A` is run only once). However, I would expect the linker to recognize and eliminate duplicate symbols and init calls. Is my assumption wrong or is it the linker? – bselu Oct 24 '14 at 13:41

2 Answers


You are including libA.a into libB.so. By doing this, both libB.so and libA.a contain ClassA.o, which defines the static member.

In the link order you specified, the linker pulls in ClassA.o from the static library libA.a, so ClassA.o's initialization code is run before main(). libB.so is also loaded at program startup, and all initializers for libB.so are run when it is loaded (which is why both constructor calls appear before main() in your output). Since libB.so includes ClassA.o, ClassA.o's static initializer is run again.

Possible fixes:

  1. Don't put ClassA.o into both libA.a and libB.so.

    g++ -shared -o libB.so ClassB.o
    
  2. Don't use both libraries; libA.a is not needed.

    g++ -o test Test.cpp libB.so
    

Applying either of the above fixes the problem:

ClassA::ClassA() this=0x600e58
main()
ClassA::test() this=0x600e58
ClassB::ClassB() this=0x7fff1a69f0cf
ClassB::test() this=0x7fff1a69f0cf
ClassB::test: call staticA.test()
ClassA::test() this=0x600e58
main: END
ClassB::~ClassB() this=0x7fff1a69f0cf
ClassA::~ClassA() this=0x600e58
Jay West
  • Concerning fix 1: If I don't put `libA.a` into `libB.so`, I end up with `libB.so` having an implicit dependency on a static library. So if I deliver `libB.so` and forget about `libA.a`, the receiver will get unresolved symbols and must conclude that I delivered an incomplete library. Can we speak of a sane library here? Concerning fix 2: This would not work if `Test.cpp` referred to a symbol within `libA.a` which was not used by `libB.so`. When linking `libA.a` into `libB.so`, the linker will throw away unused symbols of `libA.a`. So I'd still have to link my executable against `libA.a`. – bselu Oct 28 '14 at 13:45
  • Can you turn libA.a into libA.so, and ship both libA.so and libB.so? That's probably the easiest solution. The harder solution is to refactor your libraries. Each object file should exist in only one library that is used for the final link stage. – Jay West Oct 30 '14 at 17:05

Can somebody please explain what is going on here?

It's complicated.

First, the way that you linked your main executable and the shared library causes two instances of staticA (and all the other code from ClassA.cpp) to be present: one in the main executable, and another in libB.so.

You can confirm this by running

nm -AD ./test ./libB.so | grep staticA

It is then not very surprising that the ClassA constructor for the two instances runs two times, but it is still surprising that the this pointer is the same (and corresponds to staticA in the main executable).

That is happening because the runtime loader (unsuccessfully) tries to emulate the behavior of linking with archive libraries, and binds all references to staticA to the first globally-exported instance it observes (the one in test).
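The effect described above (two initialization routines that both end up constructing an object at one address) can be imitated in a single file. This is only an analogy under stated assumptions, not what the loader literally does: `Tracked`, `storage`, `init_from_executable` and `init_from_libB` are hypothetical names, and placement new stands in for the per-module initializer code.

```cpp
#include <cassert>
#include <new>
#include <vector>

// Hypothetical stand-in for ClassA: logs the address of every construction.
struct Tracked
{
    static std::vector<const void*>& log()
    {
        static std::vector<const void*> addresses;
        return addresses;
    }

    Tracked() { log().push_back(this); }
};

// One block of storage, playing the role of the staticA instance that all
// references are bound to (the one in the main executable).
alignas(Tracked) static unsigned char storage[sizeof(Tracked)];

// Stand-ins for the initializer code of the executable and of libB.so.
// Each module carries its own initializer, but both target the same address.
void init_from_executable() { new (storage) Tracked; }
void init_from_libB()       { new (storage) Tracked; }

// Runs both initializers and returns the addresses that were constructed.
std::vector<const void*> run_both_initializers()
{
    Tracked::log().clear();
    init_from_executable();
    init_from_libB();
    assert(Tracked::log().size() == 2);
    return Tracked::log();
}
```

Calling `run_both_initializers()` from a `main` yields two log entries holding the same pointer, which mirrors the two `ClassA::ClassA()` lines with an identical `this` in the question's output.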

So what can you do to fix this? That depends on what staticA actually represents.

If it is some kind of singleton that should only exist once in any program, then the easy solution is to make it so that there is only a single instance of staticA. One way to do that is to require that any program that uses libB.so also links against libA.a, and to not link libB.so against libA.a. That will eliminate the instance of staticA inside libB.so. You've claimed that "libA must be linked into libB", but that claim is false.

Alternatively, if you build libA.so instead of libA.a, then you can link libB.so against libA.so (so libB.so is self-contained). If the main application also links against libA.so, that wouldn't be a problem: there will only be one instance of staticA, inside libA.so, no matter how many times that library is used.

On the other hand, if staticA represents some kind of internal implementation detail, and you are ok with having two instances of it (so long as they don't interfere with each other), then the solution is to mark all of ClassA symbols with hidden visibility, as this answer suggests.
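The answer referred to above is not reproduced here, so the following is only a minimal sketch of what hidden visibility can look like, assuming GCC or Clang on an ELF platform. The `HIDDEN` macro, the `constructed` counter and `constructions_in_this_module` are hypothetical additions for illustration and not part of the original ClassA.

```cpp
#include <cassert>

// Assumption: GCC/Clang visibility attribute; expands to nothing elsewhere.
#if defined(__GNUC__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

// With hidden visibility, ClassA's symbols (including staticA) are not
// exported from the module they are linked into, so each module that
// contains ClassA.o keeps and initializes its own private copy.
class HIDDEN ClassA
{
public:
    ClassA() { ++constructed; }

    static int constructed;   // hypothetical counter, per module
    static ClassA staticA;
};

int ClassA::constructed = 0;  // constant-initialized before staticA's constructor runs
ClassA ClassA::staticA;

// Reports how often this module's copy of ClassA has been constructed.
int constructions_in_this_module()
{
    assert(ClassA::constructed >= 1);
    return ClassA::constructed;
}
```

A similar effect can be had for a whole compilation unit with GCC's `-fvisibility=hidden` option. Note that this leaves two distinct staticA objects in the original setup (one in the executable, one in libB.so), each constructed once, which is only acceptable if they do not need to share state.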

Update:

why the linker does not eliminate the second instance of staticA from the executable.

Because the linker does what you told it to do. If you change your link command line to:

g++ -o test Test.cpp libB.so libA.a

then the linker should not link ClassA into the main executable. To understand why the order of libraries on command line matters, read this.

Employed Russian
  • I still don't get why the linker does not eliminate the second instance of `staticA` from the executable. In theory this should be possible. – bselu Oct 28 '14 at 13:34
  • Okay, I see. Reading your update, I first thought a change in the link order might solve my problem in general. However, it does not. If I add another `libC.so` to the example that also accesses `staticA` and is linked against `libA.a`, then the c'tor is again called twice. The only sensible solution I see is not linking libA.a into libB.so. But then libB.so has an implicit (not visible) dependency on a static lib. Is this allowed? – bselu Oct 29 '14 at 08:11