38

I have about 50 different static libraries being linked into my C++ project, and linking takes about 70s on average.

I've found that changing the link order of the libraries changes this time. I guess this is expected if the linker doesn't have to keep searching for a set of symbols through the entire symbol table it has built up to that point.

I suppose I could use "nm" to get a dependency graph between the static libraries. However, that would only give me one "correct" link order. What would be the factors involved in obtaining the fastest link order?
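For illustration, here is a minimal sketch of how such a dependency graph could be derived from `nm` output. It assumes the default `nm -g` text format (optional value, a one-letter type, then the symbol name), and it treats `U` as undefined and any other uppercase type letter as defined, which is a simplification. The function names `parse_nm`, `library_edges` and `scan` are hypothetical.

```python
import subprocess
from collections import defaultdict


def parse_nm(text):
    """Split `nm -g` style output into (defined, undefined) symbol sets."""
    defined, undefined = set(), set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank lines and "member.o:" headers
        kind, name = parts[-2], parts[-1]
        if len(kind) != 1:
            continue  # not a symbol line
        if kind == "U":
            undefined.add(name)
        elif kind.isupper():
            defined.add(name)  # assumption: uppercase non-U means defined
    # Symbols used and defined within the same archive are resolved internally.
    return defined, undefined - defined


def library_edges(symbols):
    """symbols: {lib: (defined, undefined)} -> set of (user, provider) pairs."""
    providers = defaultdict(set)
    for lib, (defined, _) in symbols.items():
        for name in defined:
            providers[name].add(lib)
    edges = set()
    for lib, (_, undefined) in symbols.items():
        for name in undefined:
            for provider in providers.get(name, ()):
                if provider != lib:
                    edges.add((lib, provider))
    return edges


def scan(libs):
    """Run nm on each archive path and collect its symbol sets."""
    result = {}
    for lib in libs:
        out = subprocess.run(["nm", "-g", lib], capture_output=True,
                             text=True, check=True).stdout
        result[lib] = parse_nm(out)
    return result
```

Something like `library_edges(scan(["libA.a", "libB.a"]))` would then yield pairs where the first library must appear before the second on the link line.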

I get the feeling that it has something to do with the dependency graph mentioned above, perhaps a traversal that minimizes some quantity, but I'm really not sure which quantity.

Any help would be appreciated.

I am primarily using the Intel compiler, and GCC every now and then. Both of them seem to use the GNU ld linker when I check with top. Hope this helps...

So just to clarify a bit more on what I'm trying to ask, I already know how to get a 1-pass ordering from a set of static libraries. I'd written this script myself but as Olaf's answer below suggests, there are well-known tools for doing this.

My question is this: I already have two 1-pass link orderings, one of which runs in ~85s and the other in ~70s. So clearly there is still more optimization to be done among 1-pass orders.
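To compare candidate orderings like these reproducibly, a small timing harness can help. This is only a sketch: `time_command` and `link_command` are hypothetical names, and the `g++` driver and the output name `app` are placeholders for the project's real link line.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def link_command(objects, libs, output="app"):
    """Build a link line; placeholder driver and flags, adjust for the real build."""
    return ["g++", "-o", output, *objects, *libs]
```

Taking the median of a few runs reduces noise from a cold file cache on the first run, which could otherwise be mistaken for an effect of the ordering.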

owagh
  • Probably the list of symbols/unresolved symbols, but that is more of a hunch than knowledge. Sidenote: you **must** state what linker you are interested in, as the different linkers have completely different behavior (ibm iterates multiple times over the list of libraries until it resolves everything or stops making progress, for example) – David Rodríguez - dribeas Nov 13 '12 at 19:33
  • I did state I'm using the intel compiler suite so that would be ld (at least it seems to be running ld when I check with top). I also work with the gcc compiler suite now and then so that is ld too. – owagh Nov 13 '12 at 19:37
  • Just a rough idea: Write a script to permute all possible orders of the libraries and measure link time programmatically. – πάντα ῥεῖ Nov 13 '12 at 19:49
  • @g-makulik Did I mention that I have ~50 libraries with a link time of ~70s? – owagh Nov 13 '12 at 20:15
  • @owagh Yes, that's why I proposed doing this with a script or program. You need to run this only once, a long run though. I'm not sure if Olaf's answer will really yield the fastest link order. – πάντα ῥεῖ Nov 13 '12 at 20:20
  • hmmm, well given that there are only a few "correct" link orders (few compared to the 50! permutations) I suppose this _could_ be done. But I'd like to have a more efficient solution that I can use more easily. – owagh Nov 13 '12 at 20:24
  • @owagh I have to correct myself: according to what `lorder` man page says it should give you the fastest link order. Maybe you can find a version of the tool for your system. – πάντα ῥεῖ Nov 13 '12 at 20:27
  • Completely unrelated suggestion, assuming you're not doing this just for fun but also for money: get an SSD as work disk. That should speed up the linking far more than spending time tweaking link order... – hyde Nov 13 '12 at 20:31
  • @hyde Well yes, but that's not something I might have complete control over. :) – owagh Nov 13 '12 at 20:44

4 Answers

7

As an alternative, why not try compiling your libraries to shared libraries rather than static libraries?

Where I work, one large project's link time was around 6 minutes, and this was for only 5 libraries!

My solution (for the debug version) was to create .so files alphabetically (libA.so, libB.so, etc.) so that each individual link wasn't too long, and the final link was much shorter because all the (partial) linking had been done previously. The release version was still built the old-fashioned way because there was a perceived 'danger' with my new method.

I managed to get a one-module compile/link cycle down from 6 minutes to 10 seconds using this method.

Neil
6

In the past, the order of objects in a static library was important. You can sort the objects accordingly with:

$ lorder *.o | tsort

Maybe you could do the same with your main objects and libraries, e.g. lorder main.o test.o libsome.a libthing.a | tsort. See man lorder.
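Where `tsort` is not available, the sorting step can be approximated with Python's standard `graphlib` module (Python 3.9+). This is a sketch rather than a drop-in replacement; it assumes input pairs in the lorder convention, where the first file of each pair references symbols defined in the second, and it treats identical pairs as "this file exists". The name `link_order` is hypothetical.

```python
from graphlib import TopologicalSorter


def link_order(pairs):
    """Return files ordered so that each user precedes its providers."""
    ts = TopologicalSorter()
    for user, provider in pairs:
        if user == provider:
            ts.add(user)  # self-pair: register the file without a dependency
        else:
            # graphlib emits predecessors first, so the user is the
            # predecessor of the library that provides its symbols.
            ts.add(provider, user)
    return list(ts.static_order())
```

Like `tsort`, this returns just one of possibly many valid orders and says nothing about which of them links fastest.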

Olaf Dietsche
  • That sounds like an interesting utility but I don't have it installed. Can you point me to the package that has it? – owagh Nov 13 '12 at 20:17
  • I'm on Debian/Ubuntu. `dpkg -S lorder tsort` gives me `bsdmainutils` and `coreutils`. – Olaf Dietsche Nov 13 '12 at 20:22
  • Thanks! Also, lorder seems to be giving just a partial order graph so I suppose it is just equivalent to my own script that generated this information. This would only give us a "correct" total order but not necessarily one that would lead to the fastest total order. – owagh Nov 13 '12 at 20:27
  • @owagh The man page says it orders the libraries to the **optimum**, that the symbols can be loaded in one pass when linking. – πάντα ῥεῖ Nov 13 '12 at 20:30
  • "The lorder command reads one or more object or library archive files, looks for external references, and writes a list of paired filenames to standard output. The first of each pair of files contains references to identifiers that are defined in the second file. You can send this list to the tsort command to find an ordering of a library member file suitable for 1-pass access by ld." This will give *some* 1-pass ordering for ld. By optimum, they seem to be talking of the number of passes and not necessarily the total link time. – owagh Nov 13 '12 at 20:32
  • As the man page itself suggests, there could be multiple such 1-pass orderings. I already have two 1-pass orderings (which is what I meant by "correct"), one of which links in 90s and the other in 70s. – owagh Nov 13 '12 at 20:35
  • @BrianVandenberg You should add your comment as an answer on its own. Since OP hasn't accepted an answer yet, your answer might be the solution to his problem. – Olaf Dietsche Mar 07 '13 at 17:14
  • *In the past*? Certainly still is! – Noldorin Nov 28 '14 at 16:54
  • @OlafDietsche Was `lorder(1)` removed from Debian since 2012? On Debian Bullseye (Sid) I can't find `lorder(1)`: `$ apt-file find lorder | grep /lorder$` shows nothing. – alx - recommends codidact Jun 12 '21 at 21:01
  • Hmmm, it's been removed from `bsdmainutils`: https://salsa.debian.org/meskes/bsdmainutils/-/commit/bbefc0cd5d935a1eef2e0d063fabf22c6595c851 – alx - recommends codidact Jun 12 '21 at 21:08
  • And the reason for the removal in Debian: FreeBSD obsoleted `lorder(1)`: https://reviews.freebsd.org/D26044 – alx - recommends codidact Jun 12 '21 at 21:35
3

Based on information comparing ld to gold, the speed of ld depends on how big the symbol table is: the more the symbol table grows while processing object files, the slower the link step becomes. So, given two different 1-pass link orders, the one that places the libraries with more symbols to fix up later in the order should link faster. You should be able to modify a topological sort to include symbol count in the ordering criteria.
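One way to sketch that idea is Kahn's algorithm with a priority queue: among the libraries that are currently allowed to come next, always pick the one with the fewest symbols, so that large libraries drift toward the end. This is a greedy heuristic under the assumption stated above, not a proven optimum. The function name and the `symbol_count` mapping are hypothetical, and every library must have an entry in `symbol_count`.

```python
import heapq
from collections import defaultdict


def weighted_link_order(edges, symbol_count):
    """edges: (user, provider) pairs; symbol_count: {library: number of symbols}."""
    successors = defaultdict(set)
    indegree = {lib: 0 for lib in symbol_count}
    for user, provider in edges:
        if provider not in successors[user]:
            successors[user].add(provider)
            indegree[provider] += 1

    # Libraries that nothing else depends on may appear first on the link line.
    ready = [(symbol_count[lib], lib) for lib, deg in indegree.items() if deg == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, lib = heapq.heappop(ready)  # smallest symbol count first
        order.append(lib)
        for nxt in successors[lib]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (symbol_count[nxt], nxt))

    if len(order) != len(indegree):
        raise ValueError("cyclic dependency between libraries")
    return order
```

The result is still a valid 1-pass order; only the tie-breaking between otherwise interchangeable libraries changes, so it is worth timing against the unweighted order before trusting it.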

MSN
3

You're speaking of a one-pass ordering based on the order of objects and libraries. But when the linker searches through a static library, nothing guarantees that the members of that library are in any particular order; you can only control that by ordering the members a certain way when you create the archive with ar.

Furthermore, without understanding how the linker makes use of the static libraries, the two best assumptions that could be made are:

  1. It creates a hash table of symbols that references the object(s) that provide or need them; if this is an accurate assumption, then the best lower bound you can get on a static library is the time it takes to populate such a hash table and read from it.
  2. It blindly reads from the archive based on the order given in the archive's index.

As an attempt to find a lower bound on your optimal link time, try linking all or a subset of the objects in the archive(s) as a relocatable object; for the subset, if possible, identify all the objects actually linked in.

The man page for lorder indicates you can get the same results with ar ts <archive> ... which will print the ordered list for you. The man page for ar seems to indicate running ar with the s flag will automatically store that optimal ordering in the archive's index.

Also, be aware there could potentially be cyclic dependencies, though if you've already messed with tsort you should've been made aware of that already.
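As a sketch of how such a cycle could be reported without `tsort`, Python's `graphlib` raises `CycleError` when the graph cannot be ordered. The pairs are assumed to follow the lorder convention (user first, provider second), and `find_cycle` is a hypothetical name.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(pairs):
    """Return one dependency cycle as a list of files, or None if there is none."""
    ts = TopologicalSorter()
    for user, provider in pairs:
        if user != provider:
            ts.add(provider, user)
    try:
        ts.prepare()
    except CycleError as err:
        # The second exception argument lists the nodes of the cycle;
        # its first and last entries are the same node.
        return err.args[1]
    return None
```

Libraries caught in a cycle have no 1-pass order at all; with GNU ld they are usually either listed more than once or wrapped in `--start-group` / `--end-group`, which makes the linker rescan them.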

Finally, I'll leave you with one last piece of information. What you want is something that can solve an NP-complete problem. Good luck.


I've been running some timing tests over the last little while for a build I work on; I've added the s flag to my ARFLAGS to see what effect it has.

Overall, it seems to have increased my build time but I believe there's a logical explanation for why:

  • Most of the executables/shared objects do not use static linking
  • It's building PIC and non-PIC versions of each static library

If we were making much heavier use of static libraries we'd probably see a benefit from doing this.

Brian Vandenberg
  • I haven't visited this question in a long time... Anyways, what do you mean by "most of the executables do not use static linking". Are you referring to your execs in particular? – owagh Jun 18 '13 at 18:23
  • Yes; most of the executables in our build exclusively link against dynamic libraries. – Brian Vandenberg Jun 18 '13 at 22:06