What does it mean to 'add an index to an archive file'?

Question

My C textbook creates an archive using the -s command line option, which it says adds an index to the archive file:

ar -rcs libfile.a file1.o file2.o

However, I don't understand what an index is or what its purpose is.

It means to preprocess all the members so that when you search for a symbol it's quick and all the members don't need to be scanned. It's always desirable for an archive that's used as a static library. — Petr Skocik, Aug 25 '19 at 17:13
There's a long, complicated history tied up in part with the `ranlib` program. Basically, for the linker to be able to scan a library efficiently, some program — either `ar` itself or `ranlib` — adds a special 'index' file which identifies symbols defined in the library and the object file within the archive that defines that symbol. Many versions of `ar` add that automatically if any of the saved files is an object file. Some, it seems, require prodding with the `-s` option. Others don't add the index file at all and rely on `ranlib` to do the job. — Jonathan Leffler, Aug 25 '19 at 17:14
The `ar` on macOS documents: _`-s` — Write an object-file index into the archive, or update an existing one, even if no other change is made to the archive. You may use this modifier flag either with any operation, or alone. Running `ar s` on an archive is equivalent to running `ranlib` on it._ I've not needed to use this option explicitly on macOS for a long time (nor have I run `ranlib`) — I think things changed sometime in the middle of the first decade of the millennium. — Jonathan Leffler, Aug 25 '19 at 17:17
Don't object files already come with symbol tables? Wouldn't it seem redundant to add another symbol table to the start of the archive file? — duper21, Aug 25 '19 at 17:17
Object files contain information about what's in the one object file (and information about referenced objects as well as defined ones); the archive index contains much simpler information about which of the many object files in the archive defines each symbol, so that the linker doesn't have to scan each object file in the archive separately. — Jonathan Leffler, Aug 25 '19 at 17:19
@JonathanLeffler So would it be correct to say that the index at the start of the archive is just a giant symbol table which replaces the individual symbol tables in each object file so the linker has an easier job? — duper21, Aug 25 '19 at 17:23
Not replaces — augments. It allows the linker to identify which object file(s) to pull into the linked executable. The linker then processes the object files just as it does any other object file, checking for definitions and references, marking newly defined references as satisfied and identifying not previously used references that are not defined. But the linker doesn't have to read every file in the archive to work out which symbols are defined — it knows from the index file which ones are defined. — Jonathan Leffler, Aug 25 '19 at 17:37
@JonathanLeffler To clarify the index allows the linker to find the specific object file which defines a symbol rather than having to scan every object file to resolve a symbol? — duper21, Aug 25 '19 at 17:51
@U.Windl — Lack of time, or maybe a shortage of round tuits (so there was no opportunity to get a round tuit before about now). — Jonathan Leffler, Aug 26 '19 at 04:38

score 5 · Answer 1 · answered Aug 26 '19 at 04:36

Converting comments into an answer.

There's a long, complicated history tied up in part with the ranlib program. Basically, for the linker to be able to scan a library efficiently, some program — either ar itself or ranlib — adds a special 'index' file which identifies the symbols defined in the library and the object file within the archive that defines each of those symbols. Many versions of ar add that automatically if any of the saved files is an object file. Some, it seems, require prodding with the -s option. Others don't add the index file at all and rely on ranlib to do the job.

The ar on macOS documents:

-s — Write an object-file index into the archive, or update an existing one, even if no other change is made to the archive. You may use this modifier flag either with any operation, or alone. Running ar s on an archive is equivalent to running ranlib on it.

I've not needed to use this option explicitly on macOS for a long time (nor have I run ranlib) — I think things changed sometime in the middle of the first decade of the millennium.

Don't object files already come with symbol tables? Wouldn't it seem redundant to add another symbol table to the start of the archive file?

Each object file contains information about what's in that one object file (and information about referenced objects as well as defined ones); the archive index contains much simpler information about which of the many object files in the archive defines each symbol, so that the linker doesn't have to scan each object file in the archive separately.
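To make the difference concrete, here is a minimal sketch of what the archive index amounts to conceptually: a flat table from each defined symbol to the member that defines it. This is not the on-disk format used by any real `ar`; the struct, the function name `index_lookup`, and the symbol names are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of one archive index entry.
   Real archive formats store this differently. */
struct index_entry {
    const char *symbol;  /* a global symbol defined somewhere in the archive */
    const char *member;  /* the archive member (object file) that defines it */
};

/* Illustrative index for an archive built from file1.o and file2.o.
   The symbol names are made up for this example. */
static const struct index_entry archive_index[] = {
    { "parse_line", "file1.o" },
    { "print_line", "file1.o" },
    { "sum_values", "file2.o" },
};

/* Return the member that defines `symbol`, or NULL if no member does.
   Only the index is consulted; no object file is opened or scanned. */
const char *index_lookup(const char *symbol)
{
    size_t n = sizeof archive_index / sizeof archive_index[0];

    assert(symbol != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(archive_index[i].symbol, symbol) == 0)
            return archive_index[i].member;
    }
    return NULL;
}
```

In this model, `index_lookup("sum_values")` yields `"file2.o"` without looking inside either object file, and a symbol the archive does not define yields `NULL`. Note that the index lists definitions only; the undefined references still live in each object file's own symbol table.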

So, would it be correct to say that the index at the start of the archive is just a giant symbol table which replaces the individual symbol tables in each object file so the linker has an easier job?

Not replaces — augments. It allows the linker to identify which object file(s) to pull into the linked executable. The linker then processes the object files just as it does any other object file, checking for definitions and references, marking newly defined references as satisfied and identifying previously unused references that are not yet defined. But the linker doesn't have to read every file in the archive to work out which symbols are defined by the file — it knows from the index file which ones are defined.
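As a rough sketch of that process (a toy model, not how any particular linker is implemented), the code below starts from one undefined symbol, consults the index to find the member that defines it, pulls that member in, and then queues the member's own references. The member `file3.o`, every symbol name, and the functions `index_find` and `resolve` are assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 16

/* Hypothetical archive member: the symbols it references, and whether
   the linker has pulled it in yet. */
struct member {
    const char *name;
    const char *references[4];  /* NULL-terminated list of undefined symbols */
    bool pulled;
};

/* Hypothetical index entry: defined symbol -> position in members[]. */
struct index_entry {
    const char *symbol;
    int member;
};

struct member members[] = {
    { "file1.o", { "parse_line", NULL }, false },  /* defines read_config */
    { "file2.o", { NULL }, false },                /* defines parse_line */
    { "file3.o", { NULL }, false },                /* defines unused_helper */
};

static const struct index_entry archive_index[] = {
    { "read_config", 0 },
    { "parse_line", 1 },
    { "unused_helper", 2 },
};

/* Consult the index only: position of the defining member, or -1. */
int index_find(const char *symbol)
{
    size_t n = sizeof archive_index / sizeof archive_index[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(archive_index[i].symbol, symbol) == 0)
            return archive_index[i].member;
    }
    return -1;
}

/* Pull in the member defining `symbol`, then whatever its references need.
   Returns the number of members newly pulled in, or -1 if some symbol is
   not defined in the archive (a simplification: a real linker would leave
   it undefined and keep looking elsewhere). */
int resolve(const char *symbol)
{
    const char *pending[MAX_PENDING];
    size_t count = 0;
    int pulled = 0;

    pending[count++] = symbol;
    while (count > 0) {
        int m = index_find(pending[--count]);
        if (m < 0)
            return -1;
        if (members[m].pulled)
            continue;           /* already linked in; nothing more to do */
        members[m].pulled = true;
        pulled++;
        for (size_t i = 0; members[m].references[i] != NULL; i++) {
            assert(count < MAX_PENDING);
            pending[count++] = members[m].references[i];
        }
    }
    return pulled;
}
```

Calling `resolve("read_config")` in this model pulls in `file1.o` and then `file2.o` (because `file1.o` references `parse_line`), while `file3.o` is never touched. That is the saving the index buys: members that define nothing needed are never read.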

To clarify: the index allows the linker to find the specific object file that defines a symbol, rather than having to scan every object file to resolve it?

Yes, that’s right.
