Edit 3: After some more reading, I'm fairly confident I got something wrong about macros. The original text is still available (this edit was made after the answer was accepted), and the changes are marked with "edit 3" in these locations:
- in the "preamble"
- the question about exported imports of header-units, and macros those headers contain (and if they should affect the importer)
Edits in two locations:
- near the bottom, about the effects of importing a header-unit on the importer
- at the bottom, about OP's follow-up question on dealing with importing header-units that will trigger redefinition errors
The C++20 standard (N4868) describes the effect of importing a module or header-unit in terms of importing a TU, so it might be worthwhile to have a minimal model of what importing a TU means. [module.import] is fairly terse about that and mostly explains how you can build a DAG of modules to figure out how much a single module import will actually "import", and what transformation you apply to a header/source file to produce the header-unit/TU that ends up being imported. There is however a (non-normative) note on the intended behaviour:
[Note 1: Namespace-scope names exported by the imported translation units become visible ([basic.scope.namespace]) in the importing translation unit and declarations within the imported translation units become reachable ([module.reach]) in the importing translation unit after the import declaration.
— end note]
So essentially you produce a TU in some way, then the effect of importing should be understandable through visibility and reachability.
A "problem" with that description is that we've left out macros. As per [cpp.import], macros should only be imported when you import a header-unit (note that importing a module can lead to importing a header-unit, for example if you import a module that does export import "some_header_with_macros.h"; edit 3: this is not "false" but misleading in this context, because importing a module does not lead to importing macros, even if that module export-imports a header-unit). The formal phrasing for that specifies when certain macro directives are "active" or "inactive".
I am trying to understand what happens when we include header files in the global module fragment. What happens to code that imports such a module?
I am tempted to say "nothing except exposing some declaration to the TU". In [module.global.frag] there is a definition for a declaration being decl-reachable from another declaration. This concept is then built upon to define discarded declarations from the global module fragment. And you have a note stating this:
[Note 2: A discarded declaration is neither reachable nor visible to name lookup outside the module unit, nor in template instantiations whose points of instantiation ([temp.point]) are outside the module unit, even when the instantiation context ([module.context]) includes the module unit.
— end note]
This a priori implies that declarations that are not discarded can be visible and/or reachable. I think I understand why reachability is required, but as of now I do not see a context in which any declaration in the global module fragment should become visible to the importer.
As for macro directives, they should be visible/active in the TU that contains the global module fragment. In [module.global.frag], the following note
[Note 1: Prior to phase 4 of translation, only preprocessing directives can appear in the declaration-seq [of the global module fragment] ([cpp.pre]).
— end note]
suggests to me that the normal phases of translation apply to the TU that contains a global module fragment, so any macro in there would be expanded in the entire TU, not just the part of the TU that consists of the global module fragment. I also believe that none of the macros you retrieve via the global module fragment should ever propagate to importers of the module, because a macro is only imported when you import the header-unit that defines it, and a module-unit isn't a header-unit.
How would the above case be different from a module that exports imported header units like export import <iostream>?
The main difference should be the export, since that affects the visibility of everything you've imported, and that the global module fragment isn't specified to export any of the declarations that it brings in. An exported import is however specified to be transferred/to impact the importer of the current module, as per [module.import]:
When a module-import-declaration imports a translation unit T, it also imports all translation units imported by exported module-import-declarations in T; such translation units are said to be exported by T.
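To make that transitive rule a bit more concrete, here is a minimal sketch of a toy model. It is my own illustration, not how any compiler implements it: the names Import, ImportGraph, imported_by and the module names A to D are all hypothetical, and the model only tracks which TUs end up imported, not visibility, reachability or macros.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: one import declaration inside a translation unit.
struct Import {
    std::string target;  // name of the imported TU
    bool exported;       // true for `export import target;`
};

// Hypothetical toy model: maps each TU name to the imports it contains.
using ImportGraph = std::map<std::string, std::vector<Import>>;

// Adds `tu` to `out`, then follows only the *exported* imports of `tu`.
void collect(const ImportGraph& graph, const std::string& tu,
             std::set<std::string>& out) {
    if (!out.insert(tu).second) return;  // already visited
    const auto it = graph.find(tu);
    if (it == graph.end()) return;       // TU has no imports
    for (const Import& imp : it->second) {
        if (imp.exported) collect(graph, imp.target, out);
    }
}

// Set of TUs imported by a single `import name;` in this toy model.
std::set<std::string> imported_by(const ImportGraph& graph,
                                  const std::string& name) {
    std::set<std::string> result;
    collect(graph, name, result);
    return result;
}

// Usage example: A export-imports B; B export-imports C but only imports D.
void demo_exported_imports() {
    ImportGraph graph;
    graph["A"].push_back({"B", true});
    graph["B"].push_back({"C", true});
    graph["B"].push_back({"D", false});
    const std::set<std::string> expected{"A", "B", "C"};
    assert(imported_by(graph, "A") == expected);  // D is not carried along
}
```

In this model, importing A also imports B and C because each step goes through an exported import, while D stays private to B.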
In the case of an exported header unit, would the macros in the header unit affect any headers that are included in code that imports this module?
Edit 3: I am strongly convinced this answer is wrong, see further edit after the original answer
Assuming
import A; // imports some macro FOO
// (A exports a module-import-declaration that designates a
// header-unit that defines the macro FOO)
import B; // uses some header/header-unit that could be impacted by FOO
#include "C.h" // has some declarations that could be impacted by FOO
then B should not be impacted by A, but C.h should.
To justify that claim, I think there are two relevant quotes, one is how the import directive works [cpp.import]:
An import directive matching the first two forms of a pp-import instructs the preprocessor to import macros from the header unit ([module.import]) denoted by the header-name.
[...]
In all three forms of pp-import, the import and export (if it exists) preprocessing tokens are replaced by the import-keyword and export-keyword preprocessing tokens respectively.
[Note 1: This makes the line no longer a directive so it is not removed at the end of phase 4.
— end note]
the other would be what phase 4 of the translation process does [lex.phases]:
Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. [...] A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.
So before having to process the inclusion of C.h you should be in a state akin to
import-keyword A;
// preprocessor magic ensuring that macros imported from A are active
import-keyword B;
// preprocessor magic ensuring that macros imported from B are active
#include "C.h"
The inclusion of C.h should then be resolved "as per usual", impacted by the imports above it, whereas module B doesn't know anything about its importer's preprocessor status.
Edit 3: my new answer and what I got wrong above
After some (re-)reading through the standard, I'm fairly certain the above interpretation is erroneous, despite matching the behaviour of some implementation I've tested.
Includes and macro expansions are all resolved during translation phases 1 through 4. Importing a macro must also be done during those phases 1 through 4. The only import directives that import a macro are "the first two forms of a pp-import", both of which denote a header-name. In other words, the only import directives that trigger a macro import are the import directives that import a header-unit. The import directive that imports a module is "the third form of a pp-import", and that third form does not import macros.
So in the example above, neither B nor C.h should be impacted by A. Prior to handling the include of C.h, the translation unit should be in a state akin to
import-keyword A;
import-keyword B;
#include "C.h"
Specifically no macro is imported.
The inclusion of C.h should then be resolved "as per usual", so without any influence from A/B in terms of macros.
If instead of importing a module A, we were importing a header-unit formed from some header A.h, then the import directive would match one of the "first two forms of a pp-import", so macros would be imported, and those macros would influence how the preprocessor handles the inclusion of C.h.
In both cases, module B doesn't know anything about its importer's preprocessor status.
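As a rough illustration of that rule, here is a toy classifier. It is my own simplification, not something taken from the standard or from a compiler: the names ImportKind, classify_import and imports_macros are hypothetical, and the operand is assumed to be the text that follows import after macro replacement, so a leading " or < is treated as a header-name and everything else as a module (or partition) name.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical toy model: what the operand of an import directive designates.
enum class ImportKind { HeaderUnit, NamedModule };

// Assumption: `operand` is the text after `import`, already macro-replaced.
// A header-name is spelled "..." or <...>; anything else counts as a module name.
constexpr ImportKind classify_import(std::string_view operand) {
    const bool is_header_name =
        !operand.empty() && (operand.front() == '"' || operand.front() == '<');
    return is_header_name ? ImportKind::HeaderUnit : ImportKind::NamedModule;
}

// Only the header-name forms of a pp-import import macros.
constexpr bool imports_macros(std::string_view operand) {
    return classify_import(operand) == ImportKind::HeaderUnit;
}

// Usage example mirroring the cases discussed above.
void demo_imports_macros() {
    assert(imports_macros("\"A.h\""));     // import "A.h";    header-unit
    assert(imports_macros("<iostream>"));  // import <iostream>; header-unit
    assert(!imports_macros("A"));          // import A;        named module
}
```

The point of the sketch is that the decision depends only on the directive itself: nothing about what module A exports internally can turn import A; into a macro import.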
One major source of confusion (for me) was this quote from [module.import]:
When a module-import-declaration imports a translation unit T, it also imports all translation units imported by exported module-import-declarations in T;
I initially interpreted this to mean that if you import a module, you recursively import the exported imports of header-units, leading to some "hidden" macro importing. What I failed to notice is that [module.import] explains the effect of module-import-declarations, which are introduced by the import-keyword, and that these module-import-declarations are not at all the same thing as an import-directive:
- An import-directive is handled by the preprocessor, so during translation phases 1 through 4. An import-directive can change the preprocessor state, and that's why it is able to import macros. The import-directive is also the only way in which you can produce an import-keyword token (and hence obtain a module-import-declaration). An import-directive does not have any recursive behaviour.
- A module-import-declaration is not handled by the preprocessor, it is a priori handled in translation phase 7, so well after the preprocessor has done its job. In particular, all macros and directives have already been handled/expanded. A module-import-declaration has some recursive behaviour as explained in [module.import] and quoted above.
So "importing" inside of a translation-unit is handled in two big steps. The import-directive handles macros in the case of header-units, and leaves behind an import-keyword in all cases. The import-keyword is like a marker so that later phases of translation will import other TUs, and be affected in terms of visibility/reachability.
Also if a module just imports a header unit without exporting it, how does code that import such a module get affected? Do the header units impact the code importing the module? If no, why does the first code snippet in my question throw so many linker errors saying ODR is violated in standard library components?
Well you have pretty much already answered that question in your own answer.
Anything you import (not just header-units but also other modules and other partitions within a module) will still, at a minimum, affect which declarations/definitions are visible, and if those are subject to the ODR, like class definitions, you can end up with invalid TUs. Header-units are more susceptible to that, in a way you made me discover: header guards/pragma once cannot be enforced, because imported modules were designed not to influence other imported modules, to be import-order independent, and to be processable in advance before being imported (in short, they were actually designed to be modular).
Edit 1: I feel like what you did shouldn't even trigger ODR-violations/redefinition errors, and that what I just wrote in the paragraph above shouldn't even matter/isn't how things should work.
Importing a header-unit isn't specified like an include directive. An include directive is specified like a "copy-paste". An import directive is specified to produce an import-keyword leading to "importing a TU", which affects what declarations are visible/reachable. So when importing you do not "copy-paste" anything, and you shouldn't be "redefining" anything, you should just gain access to more declarations.
In module-only code, conflicting declarations/redefinitions from different module-units can be checked, because each module-unit is clearly "identified/named": you can track the module-unit that introduced a certain declaration, and see if a different module-unit introduced a conflicting declaration. If the same declaration from the same module-unit becomes visible via multiple different "import paths", it doesn't really matter: you are guaranteed that it is the same declaration.
Because I regard the import of header-units as a compatibility feature, and because there are already some constraints on what kind of headers you can import as a header-unit ([module.import]: "A header unit shall not contain a definition of a non-inline function or variable whose name has external linkage."), it does not sound too far-fetched to me that an implementation would attempt to track the filenames that introduced a declaration, and use that filename to disambiguate conflicting declarations. Arguably not all header-based libraries could work with that mechanism, but the set of importable headers is implementation-defined, so I assume each implementation would be allowed to impose constraints on what kind of header structure would be allowed.
I did some limited testing, and this seems to be how Visual Studio 17.3.6 deals with the issue. For example this will error:
// A.h
#ifndef A_H
#define A_H
struct Foo {};
#endif
// B.h
#ifndef B_H
#define B_H
struct Foo {};
#endif
// main.cpp
import "A.h";
import "B.h";
int main()
{
    Foo f;
}
But this will not:
// Foo.h
#ifndef FOO_H
#define FOO_H
struct Foo {};
#endif
// A.h
#ifndef A_H
#define A_H
#include "Foo.h"
#endif
// B.h
#ifndef B_H
#define B_H
#include "Foo.h"
#endif
// main.cpp
import "A.h";
import "B.h";
int main()
{
    Foo f;
}
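The two experiments above are consistent with a bookkeeping scheme along the following lines. This is only a toy model of the idea of tracking which file introduced a declaration; I have no knowledge of what Visual Studio actually does internally, and the names DeclarationTable and add are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model: remembers which file first defined each entity name.
class DeclarationTable {
public:
    // Records that `file` defines `name`.
    // Returns false when a *different* file already defined the same name
    // (a conflict). Seeing the same (name, file) pair again, for example
    // through a second import path, is accepted.
    bool add(const std::string& name, const std::string& file) {
        const auto [it, inserted] = origin_.emplace(name, file);
        return inserted || it->second == file;
    }

private:
    std::map<std::string, std::string> origin_;  // entity name -> defining file
};

// Usage example replaying the two experiments.
void demo_declaration_table() {
    DeclarationTable first;   // A.h and B.h each define their own Foo
    assert(first.add("Foo", "A.h"));
    assert(!first.add("Foo", "B.h"));    // conflict: different defining file

    DeclarationTable second;  // A.h and B.h both get Foo from Foo.h
    assert(second.add("Foo", "Foo.h"));  // reached via A.h
    assert(second.add("Foo", "Foo.h"));  // reached via B.h: same origin, fine
}
```

Under this model the first experiment fails because Foo has two different origins, while the second succeeds because both import paths lead back to Foo.h.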
If you've made it all the way down here, a little warning/disclaimer about the above. If I failed to make it obvious enough, this answer is based on my reading and interpretation of the C++20 standard, and I make no claim that I actually know how to read and interpret said standard correctly.
With that said, I wanted to go back to your very first question about how the global module fragment works. I like to think about the global module fragment as a form of (limited) "inline" header-unit that is imported but not exported. That is, if
- you create a header specifically for the current module-unit,
- put everything from the global module fragment inside that specific header,
- import that header as a header-unit at the beginning of the current module-unit,
then I think you would mostly achieve the same effect as using the global module fragment:
- declarations found in that fictitious header-unit would become visible/reachable in the module-unit
- those declarations shouldn't become visible to importers of the module-unit
- those declarations are in the purview of the global module
- macros from the fictitious header-unit would become active in the module-unit
- those macros shouldn't become active in the importers of the module-unit
Edit 2
I also have a new question related to importing header units which may result in multiple definition errors - how should I then go past this?
As I mentioned a little earlier, I feel like this is an implementation matter. All the same, implementations may not all behave the same way with regard to importing header-units, and that's an annoying constraint. I feel like your best shot at portable code is either:
- not to import header-units and use the global module fragment, or
- group up all headers that might trigger redefinition issues into one intermediate header, and import that