
I have a project with thousands of C files, many libraries, and dozens of programs to link, and to speed up the compilation, I am combining C files into translation units that include multiple C files. This is sometimes referred to as single compilation unit, single translation unit, or unity build.

I have several of these translation units compiled into different libraries. These libraries were previously created by compiling each C file individually.

For example:

old library.lib:

file1.o
file2.o
file3.o
file4.o
file5.o
file6.o

new library.lib:

translation_unit_1.o
translation_unit_2.o

translation_unit_1.c:

#include "file1.c"
#include "file2.c"
#include "file3.c"

translation_unit_2.c:

#include "file4.c"
#include "file5.c"
#include "file6.c"

So these compile into: translation_unit_1.o and translation_unit_2.o. And the library is the new library.lib shown above.

Now say I have a program that I want to link against library.lib and that refers to a function in file2.c. However, the program compiles its own, different version of file1.c, which duplicates symbols from the file1.c in the library, so it needs only file2.c from library.lib in order to link. Or perhaps I need to link code from file1.c but can't link file2.c because it has a dependency that I don't want to rely on (example below).

program:

main.o
file1.o
library.lib

Is there a way with any linker that you know of to get the linker to only pull the code from file2.c out of translation_unit_1.o object code and use that to link main.o to make the program?

An alternative would be to split the translation_unit_1.o out into file1.o, file2.o, file3.o if that is possible, then feed that to the linker.

Thanks for any help.

edit 1

This is for a single code base that is compiled both for a bare-metal ARM platform that uses ELF, compiled with the ARM ADS 1.2 toolchain, and for a Windows platform that uses the Visual Studio toolchain. However, thoughts on how to approach the problem on other platforms and toolchains are welcome.

Here is a concrete example on macOS using clang.

example code below is here: https://github.com/awmorgan/single_translation_unit_lib_link

library:

file1.c this file is needed to link

file2.c this file is not used to link and has an unresolved dependency which could be in another library or object

main.c:

int main( void ) {
    extern int file1_a( void );
    int x = file1_a();
}

file1.c:

int file1_a(void) {
    return 1;
}

file2.c:

int file2_a( void ) {
    extern int file3_a( void );
    return file3_a(); // file3_a() is located somewhere else
}

single_translation_unit.c:

#include "file1.c"
#include "file2.c"

this works to produce program1.out:

++ clang -c file1.c -o file1.o
++ clang -c file2.c -o file2.o
++ libtool -static file1.o file2.o -o library1.lib
++ clang -c main.c -o main1.o
++ clang main1.o library1.lib -o program1.out

this fails to produce program2.out:

++ clang -c single_translation_unit.c -o single_translation_unit.o
++ libtool -static single_translation_unit.o -o library2.lib
++ clang -c main.c -o main2.o
++ clang main2.o library2.lib -o program2.out
Undefined symbols for architecture x86_64:
  "_file3_a", referenced from:
      _file2_a in library2.lib(single_translation_unit.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

changing the link order does not work either:

++ clang library2.lib main2.o -o program2.out
Undefined symbols for architecture x86_64:
  "_file3_a", referenced from:
      _file2_a in library2.lib(single_translation_unit.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Bill Morgan
  • https://stackoverflow.com/questions/28607965/gcc-linker-library-search-order-paths-plus-static-vs-shared – stark Feb 09 '19 at 14:47
  • @Stark, that link refers to how to search for libraries, not how to search/extract objects from libraries. – Paul Ogilvie Feb 09 '19 at 16:18
  • @Paul Yes, OP could probably solve their problem by reordering the libraries when linking to put newest first. – stark Feb 09 '19 at 17:32
  • It seems to be time to learn how to use `make` then write a `Makefile` that performs all the desired compile/link functionality. – user3629249 Feb 09 '19 at 19:51

2 Answers


Is there a way with clang, gcc, microsoft or any linker

None of clang, gcc or microsoft is a linker (the first two are compilers, and the third is a corporation).

The answer also depends on the platform (which you didn't specify).

If you are building on Linux, or another ELF platform, you could compile your code with -ffunction-sections -fdata-sections and link with -Wl,--gc-sections, and the linker will then do what you want.

Is there a way to have a linker pull part of an object file from a library for linking?

In general, linkers operate on sections, and can't split sections apart (you get all or nothing).

Without -ffunction-sections, all functions in a single translation unit end up in a single .text section (this is an approximation -- template instantiations and out-of-line function definitions for inline functions usually end up in a section of their own). Therefore, the linker can't select some, but not all, parts of the .text.
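As a rough illustration of this all-or-nothing behaviour, here is a toy model in C of how a linker pulls whole object files out of a static library. It is only a sketch under simplifying assumptions, not how any real linker is implemented: `struct member`, the `separate_lib` and `unity_lib` tables, and `count_unresolved` are hypothetical names made up for this example, and the symbol lists are copied by hand from the question's file1.c and file2.c.

```c
#include <string.h>

#define MAX_SYMS 4   /* symbol slots per member in this toy model      */
#define MAX_WORK 16  /* upper bound on members and on tracked symbols  */

/* Toy model of one archive member (one object file). */
struct member {
    const char *defined[MAX_SYMS];   /* symbols it defines, NULL-padded    */
    const char *undefined[MAX_SYMS]; /* symbols it references, NULL-padded */
};

/* Hypothetical stand-in for library1.lib: one member per C file. */
const struct member separate_lib[2] = {
    { { "file1_a" }, { 0 } },          /* file1.o */
    { { "file2_a" }, { "file3_a" } },  /* file2.o */
};

/* Hypothetical stand-in for library2.lib: a single unity member. */
const struct member unity_lib[1] = {
    { { "file1_a", "file2_a" }, { "file3_a" } },
};

/* Returns 1 if sym occurs in the first n entries (stops at a NULL entry). */
static int in_list(const char *const *list, int n, const char *sym) {
    for (int i = 0; i < n && list[i]; i++)
        if (strcmp(list[i], sym) == 0)
            return 1;
    return 0;
}

/* Index of the member that defines sym, or -1 if no member does. */
static int find_definer(const struct member *lib, int n, const char *sym) {
    for (int m = 0; m < n; m++)
        if (in_list(lib[m].defined, MAX_SYMS, sym))
            return m;
    return -1;
}

/* Counts the symbols left unresolved when linking a program that needs
   `root` against `lib` (n <= MAX_WORK members). */
int count_unresolved(const struct member *lib, int n, const char *root) {
    const char *work[MAX_WORK];
    int pulled[MAX_WORK] = { 0 };
    int n_work = 0, unresolved = 0;

    work[n_work++] = root;
    for (int i = 0; i < n_work; i++) {
        int m = find_definer(lib, n, work[i]);
        if (m < 0) { unresolved++; continue; }  /* nobody defines it */
        if (pulled[m]) continue;                /* member already linked */
        pulled[m] = 1;
        /* Pulling a member brings in ALL of its references,
           including those of functions the program never calls. */
        for (int k = 0; k < MAX_SYMS && lib[m].undefined[k]; k++)
            if (n_work < MAX_WORK && !in_list(work, n_work, lib[m].undefined[k]))
                work[n_work++] = lib[m].undefined[k];
    }
    return unresolved;
}
```

With these tables, `count_unresolved(separate_lib, 2, "file1_a")` gives 0, while `count_unresolved(unity_lib, 1, "file1_a")` gives 1. That mirrors the question: program1.out links, and program2.out fails on `_file3_a`.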

Employed Russian

With the GCC/binutils ELF toolchain, or suitably compatible tools, you can do this by:

  1. Compiling single_translation_unit.c with the options -ffunction-sections, -fdata-sections
  2. Linking program2.out with the linker option -gc-sections.

E.g. (on Linux):

$ gcc -ffunction-sections -fdata-sections -c single_translation_unit.c -o single_translation_unit.o
$ ar rcs library2.a single_translation_unit.o # On Mac OS, use libtool to make the static library if you prefer.
$ gcc -c main.c -o main2.o
$ gcc main2.o library2.a -Wl,-gc-sections -o program2.out

You may replace gcc with clang throughout.

The linkage succeeds because:

  • In compilation, -ffunction-sections directed the compiler to emit each function definition in a distinct code section of the object file, containing nothing else, rather than merging them all into a single .text section, as per default.
  • In the linkage, -Wl,-gc-sections directed the linker to discard unused sections, i.e. sections in which no symbols were referenced by the program.
  • The definition of the unreferenced function file2_a acquired a distinct code section, containing nothing else, which was therefore unused. The linker was able to discard this unused section, and along with it the unresolved reference to file3_a within the definition of file2_a.
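To make the first point concrete, the following sketch places each function in its own section by hand, which is roughly what -ffunction-sections arranges automatically. It assumes GCC or Clang; the `OWN_SECTION` macro is a hypothetical helper invented for this example, and it only applies the section attribute on ELF targets. The stub definition of `file3_a` is also an assumption, added purely so that the sketch links on its own; in the question, `file3_a` lives somewhere else.

```c
/* Hand-written approximation of what -ffunction-sections does.
   OWN_SECTION is a hypothetical helper, not part of any toolchain. */
#if defined(__ELF__) && defined(__GNUC__)
#define OWN_SECTION(name) __attribute__((section(".text." #name)))
#else
#define OWN_SECTION(name) /* no-op on non-ELF targets or other compilers */
#endif

/* Emitted into section .text.file1_a on ELF. */
OWN_SECTION(file1_a)
int file1_a(void) {
    return 1;
}

/* Stub so this sketch links standalone (assumption, see lead-in). */
OWN_SECTION(file3_a)
int file3_a(void) {
    return 3;
}

/* Emitted into section .text.file2_a on ELF. A linker that discards
   unused sections can drop this function, and its reference to
   file3_a, when nothing in the program refers to file2_a. */
OWN_SECTION(file2_a)
int file2_a(void) {
    return file3_a();
}
```

On an ELF target, the per-function section names can be inspected with `objdump -h` on the resulting object file.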

So no references to file2_a or file3_a were finally linked, as we can see:

$ nm program2.out | egrep '(file2_a|file3_a)'; echo Done
Done

And if we re-do the linkage requesting a mapfile:

$ gcc main2.o library2.a -Wl,-gc-sections,-Map=mapfile -o program2.out

then the mapfile will show us:

...
...
Discarded input sections

 ...
 ...
 .text.file2_a  0x0000000000000000        0xb library2.a(single_translation_unit.o)
 ...
 ...

that the function section .text.file2_a originating in library2.a(single_translation_unit.o) was indeed thrown away.

BTW...

Because of the way a static library is used in linkage, there is no point in archiving the single object file single_translation_unit.o alone into a static library library2 and then linking your program against library2, if you know that your program references any symbol defined in single_translation_unit.o. You might as well skip creating library2 and just link single_translation_unit.o instead. Given that symbols defined in single_translation_unit.o are needed, the linkage:

$ gcc main2.o library2.a [-Wl,-gc-sections] -o program2.out

is exactly the same linkage as:

$ gcc main2.o single_translation_unit.o [-Wl,-gc-sections] -o program2.out

with or without -Wl,-gc-sections.

And...

I trust you're aware that while a unity build may well be fastest for your builds-from-clean, it may equally well be slow for most incremental builds, as against an automated build system, typically Make based, that is well-crafted to minimise the amount of rebuilding required per source change. Chances are if you can benefit from a unity build, it's only from a unity build as well as an efficient incremental build.

Mike Kinghan