Static Linking: Merging Binaries with Common Symbols

Question

We have a collection of C and C++11 code that gets compiled multiple times, producing distinct static libraries whose symbols conflict with each other. For various reasons, we are unable to change the way the source is organized or namespaced.

An example use case for the sake of this question:

We are creating a wrapper/driver for multiple vendors
We are using a combination of older and newer compilers (GCC 4.9 ... 7.x, and Clang 4.0 ... 11.0.1)
We release both dynamically and statically linked solutions
The current code base has a lot of duplication (which we cannot change/fix)
This duplication includes conflicting symbol names
Currently, we release distinct variants, one per vendor, so there are no conflicts
We are now required to release multi-vendor variants where (a subset of) all vendors are represented in the release

For the dynamically linked variant, we have a working solution, so we can ignore that for the sake of this question.

For static linking, we have tried a number of approaches, with varying levels of success. We are looking for a solution that doesn't involve changing the source code (some of which comes from third parties and we have no control over it).

The main thing we tried for static linking was to use objcopy to rename the offending symbols so they become unique (using the --redefine-syms <file> option). Using that option, we have been able to run tests successfully. However, the DWARF debug symbols do not get renamed, so a debugger (e.g. GDB) is unable to locate key symbols (like object virtual tables) because the symbol has been renamed but the corresponding .debug_str entry for it has not.
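To make the renaming step concrete, here is a minimal sketch of how a map file for --redefine-syms could be generated. The helper name build_redefine_map, the vendorA_ prefix, and the example symbols are made up for illustration. The only part taken from objcopy itself is the file format: one "old new" pair per line.

```python
def build_redefine_map(symbols, prefix):
    """Return the text of an objcopy --redefine-syms file.

    Each output line is "old new", where new is the old name with a
    (hypothetical) per-vendor prefix added. Duplicates are dropped and the
    result is sorted so the file is stable between runs.
    """
    lines = []
    for name in sorted(set(symbols)):
        lines.append(f"{name} {prefix}{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example symbols are invented; in practice they would come from
    # comparing the symbol tables of the conflicting libraries.
    conflicting = ["init_driver", "_ZN6Driver4openEv"]
    print(build_redefine_map(conflicting, "vendorA_"), end="")
```

The resulting text would be written to a file and passed to objcopy with --redefine-syms for each vendor's objects. Note that a mangled C++ name with a prefix in front of it no longer demangles, which is fine for linking but makes the renamed symbols harder to read.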

So we are left with a number of alternatives, which is why I'm reaching out to you today.

Option 1: Also rename the debug symbols.

This sounds simple enough, but is actually quite complicated:

objcopy does not have an option for this (even the GNU-specific --debugging option doesn't recognize the DWARF format, or at least the variant we are trying).
libdwarf doesn't have APIs to replace existing information (as opposed to "producing" new information).
patchelf can't do this.
Neither GCC nor Clang appears to have options to manipulate this.
Interpreting the DWARF information manually is likely not an option (we don't have complete control over debug symbol generation, since clients are free to use any compiler).

First question: how can I tell what kind and version of debug information is present in an object file?
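As a partial illustration of what I mean by "version": for DWARF, each unit in .debug_info starts with a length field followed by a 2-byte version number, so the version can in principle be read straight from the section bytes. The sketch below assumes the section has already been dumped to raw, uncompressed bytes (for example with objcopy --dump-section) and that the byte order is known. The function name dwarf_unit_versions is my own, and this says nothing about non-DWARF formats.

```python
import struct


def dwarf_unit_versions(debug_info, little_endian=True):
    """List (offset, is_64bit_dwarf, version) for each unit in .debug_info.

    Assumes `debug_info` is the raw, uncompressed section content.
    A 32-bit DWARF unit starts with a 4-byte length; a 64-bit DWARF unit
    starts with 0xffffffff followed by an 8-byte length. The 2-byte
    version immediately follows the length in both cases.
    """
    order = "<" if little_endian else ">"
    results = []
    pos = 0
    while pos + 6 <= len(debug_info):
        (length,) = struct.unpack_from(order + "I", debug_info, pos)
        header = 4
        is_64 = False
        if length == 0xFFFFFFFF:
            if pos + 14 > len(debug_info):
                break  # truncated 64-bit header
            (length,) = struct.unpack_from(order + "Q", debug_info, pos + 4)
            header = 12
            is_64 = True
        (version,) = struct.unpack_from(order + "H", debug_info, pos + header)
        results.append((pos, is_64, version))
        if length == 0:
            break  # malformed unit; stop instead of looping forever
        # The length field does not count itself, so skip header + length.
        pos += header + length
    return results
```

This only answers the DWARF half of the question, and only when the section is present and uncompressed; readelf --debug-dump=info prints the same per-unit version in its output.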

Unless there's a tool out there that can perform this operation in some way (and I've been looking for a while now), all I can think of doing is to manually update the .debug_str section by choosing renamed symbols that have exactly the same string length as existing strings. That way the string table can be modified in-place without needing to modify offsets in other sections.

But this also seems to be very specific to a particular type of debug info, and may not be transferable or forward compatible (which is why I'd much rather use an intermediate tool to do the job).
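For clarity, the in-place edit I have in mind looks roughly like the sketch below. It assumes the .debug_str content has been extracted as raw bytes, that the names are stored there as plain NUL-terminated strings, and that no other string shares a tail with a renamed one. The function name patch_string_table and the sample names are invented for the example.

```python
def patch_string_table(table, renames):
    """Return a copy of a NUL-terminated string table with renames applied.

    `renames` maps old names to new names (both bytes). Every replacement
    must have exactly the same length as the name it replaces, so that no
    string moves and offsets stored in other sections stay valid.
    """
    for old, new in renames.items():
        if len(old) != len(new):
            raise ValueError(f"length mismatch: {old!r} -> {new!r}")

    out = bytearray(table)
    pos = 0
    while pos < len(out):
        end = out.find(b"\x00", pos)
        if end == -1:
            end = len(out)  # final string without a terminator
        new = renames.get(bytes(out[pos:end]))
        if new is not None:
            out[pos:end] = new  # same length, so the table size is unchanged
        pos = end + 1
    return bytes(out)


if __name__ == "__main__":
    table = b"\x00init_driver\x00main\x00"
    patched = patch_string_table(table, {b"init_driver": b"vndA_driver"})
    print(patched)
```

The same-length constraint is what keeps this simple, but it also means the new names in the objcopy map would have to be chosen to match, and names stored inline in .debug_info rather than in .debug_str would not be covered.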

Option 2: Somehow influence the compilation.

I don't know how to do this, or whether it even makes sense, but if there's a way to instruct the compiler to encapsulate the code it finds into an enclosing namespace or equivalent, then both the symbol names and the debug symbols would be "well formed" and consistent with each other, and there would be no symbol conflicts.

The main downside of this is that we have some libraries (internal or external) that we do not want to rename, which makes it difficult (or impossible) to specify to the compiler correctly.

Option 3: Anything else I haven't thought of?

Hopefully you have some useful suggestions for me to explore.

Thanks for your attention.
