C++ Modules and the C++ ABI

Question

I've been reading about the C++ modules proposal (latest draft) but I don't fully understand what problem(s) it aims to solve.

Is its purpose to allow a module built by one compiler to be used by any other compiler (on the same OS/architecture, of course)? That is, does the proposal amount to standardizing the C++ ABI?

If not, is there another proposal being considered that would standardize the C++ ABI and allow compilers to interoperate?

score 33 · Accepted Answer · answered Jan 31 '12 at 00:44

Pre-compiled headers (PCH) are special files that certain compilers can generate for a .cpp file. What they are is exactly that: pre-compiled source code. They are source code that has been fed through the compiler and built into a compiler-dependent format.

PCHs are commonly used to speed up compilation. You put commonly used headers in the PCH, then just include the PCH. When you do a #include on the PCH, your compiler does not actually do the usual #include work. It instead loads these pre-compiled symbols directly into the compiler. No running a C++ preprocessor. No running a C++ compiler. No #including a million different files. One file is loaded and symbols appear fully formed directly in your compiler's workspace.

I mention all that because modules are PCHs in their perfect form. PCHs are basically a giant hack built on top of a system that doesn't allow for actual modules. The purpose of modules is ultimately to be able to take a file, generate a compiler-specific module file that contains symbols, and then some other file loads that module as needed. The symbols are pre-compiled, so again, there is no need to #include a bunch of stuff, run a compiler, etc. Your code says, import thing.foo, and it appears.

Look at any of the STL-derived standard library headers. Take <map> for example. Odds are good that this file is either gigantic or has a lot of #inclusions of other files that make the resulting file gigantic. That's a lot of C++ parsing that has to happen. It must happen for every .cpp file that has #include <map> in it. Every time you compile a source file, the compiler has to recompile the same thing. Over. And over. And over again.

Does <map> change between compilations? Nope, but your compiler can't know that. So it has to keep recompiling it. Every time you touch a .cpp file, it must compile every header that this .cpp file includes. Even though you didn't touch those headers or source files that affect those headers.

PCH files were a way to get around this problem. But they are limited, because they're just a hack. You can only include one per .cpp file, because it must be the first thing included by .cpp files. And since there is only one PCH, if you do something that changes the PCH (like add a new header to it), you have to recompile everything in that PCH.

Modules have essentially nothing to do with cross-compiler ABI (though having one of those would be nice, and modules would make it a bit easier to define one). Their fundamental purpose is to speed up compile times.

score 12 · Answer 2 · answered Jan 31 '12 at 00:20

Modules are what Java, C#, and a lot of other modern languages offer. They immensely reduce compile time simply because the code that's in today's header doesn't have to be parsed over and over again, everytime it's included. When you say #include <vector>, the content of <vector> will get copied into the current file. #include really is nothing else than copy and paste.

In the module world, you simply say import std.vector; for example and the compiler loads the query/symbol table of that module. The module file has a format that makes it easy for the compiler to parse and use it. It's also only parsed once, when the module is compiled. After that, the compiler-generated module file is just queried for the information that is needed.

Because module files are compiler-generated, they'll be pretty closely tied to the compiler's internal representation of the C++ code (AST) and will as such most likely not be portable (just like today's .o/.so/.a files, because of name mangling etc.).

score 8 · Answer 3 · answered Jul 20 '12 at 20:28

Modules in C++ have to be primarily better thing than today solutions, that is, when a library consists of a *.so file and *.h file with API. They have to solve the problems that are today with #includes, that is:

require macroguards (macros that prevent that definitions are provided multiple times)
are strictly text-based (so they can be tricked and in normal conditions they are reinterpreted, which gives also a chance to look differently in different compilation unit to be next linked together)
do not distinguish between dependent libraries being only instrumentally used and being derived from (especially if the header provides inline function templates)

Despite to what Xeo says, modules do not exist in Java or C#. In fact, in these languages "loading modules" relies on that "ok, here you have the CLASSPATH and search through it to find whatever modules may provide symbols that the source file actually uses". The "import" declaration in Java is no "module request" at all - the same as "using" in C++ ("import ns.ns2.*" in Java is the same as "using namespace ns::ns2" in C++). I don't think such a solution can be used in C++. The closest approximation I can imagine are packages in Vala or modules in Tcl (those from 8.5 version).

I imagine that C++ modules are rather not possible to be cross-platform, nor dynamically loaded (requires a dedicated C++ dynamic module loader - it's not impossible, but today hard to define). They will definitely by platform-dependent and should also be configurable when requested. But a stable C++ ABI is practically only required within the range of one system, just as it is with C++ ABI now.

Since Java 9, there are in fact modules in Java. – Pedro Lamarão Mar 08 '22 at 16:21 — Pedro Lamarão, Mar 08 '22 at 16:21

C++ Modules and the C++ ABI

3 Answers3