
I have two files Interface.cppm (Primary Module Interface Unit) and main.cpp. I don't have any other module units for this module.

In Interface.cppm, I have the following contents

module;

#include <cstdint>

export module Interface; 

import <algorithm>;
import <iostream>;
import <memory>;
import <sstream>;
import <string>;
import <tuple>;
import <type_traits>;
import <vector>;

//Code that this interface exports and
//implementation details.

In main.cpp, I have the following code:

import Interface;
import <iostream>;
import <memory>;
import <string>;

int main(){
    //Using the contents of Interface module
}

I precompiled the header units and put them in a folder called header-units. I then compiled my code using the following commands:

clang++ -std=c++20 Interface.cppm -fmodule-file=./header-units/algorithm.pcm -fmodule-file=./header-units/iostream.pcm --precompile -fmodule-file=./header-units/memory.pcm -fmodule-file=./header-units/sstream.pcm -fmodule-file=./header-units/string.pcm -fmodule-file=./header-units/tuple.pcm -fmodule-file=./header-units/type_traits.pcm -fmodule-file=./header-units/vector.pcm -fmodule-file=./header-units/unordered_map.pcm -o Interface.pcm    //This works fine

clang++ -std=c++20 main.cpp -fmodule-file=Interface.pcm -fmodule-file=./header-units/iostream.pcm -fmodule-file=./header-units/string.pcm -fmodule-file=./header-units/memory.pcm -c -o main.o   //This works fine

clang++ -std=c++20 Interface.pcm -c -o Interface.o   //This works fine

clang++ -std=c++20 Interface.o main.o -o output

Following the last command, I got a series of linker errors similar to the following:

/usr/bin/ld: main.o: in function `std::bad_alloc::bad_alloc()':
main.cpp:(.text+0x0): multiple definition of `std::bad_alloc::bad_alloc()'; Interface.o:Interface.pcm:(.text+0x0): first defined here
/usr/bin/ld: main.o: in function `std::exception::exception()':
main.cpp:(.text+0x40): multiple definition of `std::exception::exception()'; Interface.o:Interface.pcm:(.text+0x40): first defined here
/usr/bin/ld: main.o: in function `std::bad_array_new_length::bad_array_new_length()':
<and many others>

I tried other things like exporting the header units from the Interface module and not importing these header units in main.cpp like this:

//Interface.cppm
module;
#include <cstdint>
export module Interface;
export import <iostream>;
export import <memory>;
export import <string>;
import <algorithm>;
....


//main.cpp
import Interface;

int main(){
    //Code using the Interface
}

but this had the same effect i.e. linker errors for multiple definitions in standard library components. I am not sure what I am doing wrong here. Would be great if someone can help me with this.

Update - I managed to get rid of this problem (by a trial and error method) by doing this:

//Interface.cppm
module;
#include <algorithm>
#include <cstdint>
#include <iostream>
...
export module Interface;
//Code that this interface exports and
//implementation details.

I changed all the imports to includes in the global module fragment in Interface.cppm.

//main.cpp
import Interface;
import <iostream>;
import <memory>;
import <string>;
int main(){
   //Code that uses the Interface module
}

In main.cpp, I just left the imports as they were.

This was able to link fine but I am still not sure why.

I am trying to understand what happens when we include header files in the global module fragment. What happens to code that imports such a module?

How would the above case be different from a module that exports imported header units like export import <iostream>?

In the case of an exported header unit, would the macros in the header unit affect any headers that are included in code that imports this module?

Also, if a module just imports a header unit without exporting it, how does code that imports such a module get affected? Do the header units impact the code importing the module? If no, why does the first code snippet in my question throw so many linker errors saying ODR is violated in standard library components?

If someone can help me understand this, it would go a long way in helping me understand modules better.

user17799869

3 Answers


I found out the answer myself as to why I am getting redefinition errors.

I got the answer after checking this CppCon video by Nathan Sidwell, starting from timestamp 9 minutes and 50 seconds. Nathan Sidwell attempted to convert TinyXML2 to use modules and encountered multiple definition errors with standard library components, just like I did.

I will summarize what he said here:

Normally, to avoid multiple definition errors when a header file is included more than once in the same translation unit, we use an include guard.

Suppose we have the following files:

//widget.h
#ifndef WIDGET_H   // no leading underscore + capital: such names are reserved
#define WIDGET_H

class Widget {...};

#endif

//foo.h
#ifndef FOO_H   // no leading underscore + capital: such names are reserved
#define FOO_H

#include "widget.h"
...

#endif

//bar.cpp
#include "widget.h"
#include "foo.h"

...

In this case, the include guard in widget.h prevents the Widget class definition from being included twice in the translation unit corresponding to bar.cpp.

However if we do this:

//widget.h and foo.h as above

//bar.cpp
#include "widget.h"

import "foo.h";

the code will fail to compile because of a redefinition error for class Widget in the translation unit corresponding to bar.cpp. This is because header units (here foo.h is imported as a header unit) differ from textual includes: include guards have no effect across them.

The problem is the #include "widget.h" within foo.h. The include guard in widget.h does not prevent its contents from reaching the translation unit for bar.cpp through the header unit, even though bar.cpp has already included widget.h directly. As a result, class Widget is defined twice in this translation unit, which violates the ODR.

It is exactly the same thing that is happening in my code. The problem was with my primary module interface file Interface.cppm.

I will analyze the first two code snippets that caused multiple definition errors in my original question and then answer why it worked in the third code snippet.

My first snippet was

//Interface.cppm
module;

#include <cstdint>

export module Interface; 

import <algorithm>;
import <iostream>;
import <memory>;
import <sstream>;
import <string>;
import <tuple>;
import <type_traits>;
import <vector>;

//Code that this interface exports and
//implementation details.


//main.cpp
import Interface;
import <iostream>;
import <memory>;
import <string>;

int main(){
    //Using the contents of Interface module
}

Here Interface.cppm imports multiple standard library headers as header units, and main.cpp imports some of these header units again. One of the problems is with import <sstream> and import <string>: the header <sstream> has a #include <string>, and I am importing <string> again. The standard library header <string> includes other standard library headers as well as some internal implementation headers, such as exception, compare and so on. The multiple definition errors that I get are for these. Also, <sstream> and <iostream> directly include common headers like <ios>, <istream> and <ostream>, which accounts for the other major chunk of redefinition errors. There are further overlaps as well, for example <vector> and <string> both include <initializer_list>.

Essentially the same problem happens in the 2nd code snippet here:

//Interface.cppm
module;
#include <cstdint>
export module Interface;
export import <iostream>;
export import <memory>;
export import <string>;
import <algorithm>;
....


//main.cpp
import Interface;

int main(){
    //Code using the Interface
}

Here the only change is that Interface.cppm re-exports some of the imported header units so that main.cpp doesn't have to import them. But Interface.cppm still imports the header units <sstream> and <string>, so the same redefinition errors remain.

However in this 3rd snippet here:

//Interface.cppm
module;
#include <algorithm>
#include <cstdint>
#include <iostream>
...
export module Interface;
//Code that this interface exports and
//implementation details.

//main.cpp
import Interface;
import <iostream>;
import <memory>;
import <string>;
int main(){
   //Code that uses the Interface module
}

there are no redefinition errors. This is because here Interface.cppm does not use imports but uses includes in the global module fragment and the include guards come into play here and prevent multiple inclusions.

Within main.cpp however I have 3 imports i.e. of iostream, memory and string.

I wanted to see why these 3 imports of header units didn't cause multiple definition errors and I dug into the libc++ (the standard library that I was using) code.

Except for files called version, __assert, __config and some additional implementation defined header files like <__memory/allocate_at_least.h>, they did not have anything in common unlike the other header units in Interface.cppm. I wasn't including/importing any of these files in main.cpp directly and hence there were no collisions.

Now I found out why my code worked or why it didn't work but still the other questions that I had remain unanswered. I also have a new question related to importing header units which may result in multiple definition errors - how should I then go past this? I will ask these in a new question.

halfer
user17799869

Edit 3: After some reading I'm pretty confident I got something wrong about macros. The original text is still available (this edit was made after the answer was accepted), and changes are marked with "edit 3" in these locations:

  • in the "preamble"
  • the question about exported imports of header-units, and macros those headers contain (and if they should affect the importer)

Earlier edits (edit 1 and edit 2) are in two locations:

  1. near the bottom, about the effects of importing a header-unit on the importer
  2. at the bottom, about OP's follow-up question on dealing with importing header-units that will trigger redefinition errors


The C++20 standard (N4868) describes the effect of importing a module or header-unit in terms of importing a TU, so it might be worthwhile to have a minimal model of what importing a TU means. [module.import] is fairly terse about that and mostly explains how you can build a DAG of modules to figure out how much a single module import will actually "import", and what transformation you apply to a header/source file to produce the header-unit/TU that ends up being imported. There is however a (non-normative) note on the intended behaviour:

[Note 1: Namespace-scope names exported by the imported translation units become visible ([basic.scope.namespace]) in the importing translation unit and declarations within the imported translation units become reachable ([module.reach]) in the importing translation unit after the import declaration. — end note]

So essentially you produce a TU in some way, then the effect of importing should be understandable through visibility and reachability. A "problem" with that description is that we've left out macros. As per [cpp.import], macros should only be imported when you import a header-unit (note that importing a module can lead to importing a header-unit, for example if you import a module that does export import "some_header_with_macros.h"; edit 3: this note is not false but misleading in this context, because importing a module does not lead to importing macros, even if that module export-imports a header-unit). The formal phrasing for that specifies when certain macro directives are "active" or "inactive".


I am trying to understand what happens when we include header files in the global module fragment. What happens to code that imports such a module?

I am tempted to say "nothing except exposing some declarations to the TU". In [module.global.frag] there is a definition for a declaration being decl-reachable from another declaration. This concept is then built upon to define discarded declarations from the global module fragment. And there is a note stating this:

[Note 2: A discarded declaration is neither reachable nor visible to name lookup outside the module unit, nor in template instantiations whose points of instantiation ([temp.point]) are outside the module unit, even when the instantiation context ([module.context]) includes the module unit. — end note]

This a priori implies that declarations that are not discarded can be visible and/or reachable. I think I understand why reachability is required, but as of now I do not see a context in which any declaration in the global module fragment should become visible to the importer.

As for macro directives, they should be visible/active in the TU that contains the global module fragment. In [module.global.frag], the following note

[Note 1: Prior to phase 4 of translation, only preprocessing directives can appear in the declaration-seq [of the global module fragment] ([cpp.pre]). — end note]

suggests to me that the normal phases of translation apply to the TU that contains a global module fragment, so any macro in there would be expanded in the entire TU, not just in the part of the TU that consists of the global module fragment. I also believe that none of the macros you retrieve via the global module fragment should ever propagate to importers of the module, because importing a macro is only done when you import the header-unit that defines the macro, and a module-unit isn't a header-unit.


How would the above case be different from a module that exports imported header units like export import <iostream>?

The main difference should be the export, since that affects the visibility of everything you've imported, whereas the global module fragment isn't specified to export any of the declarations that it brings in. An exported import, however, is specified to carry over to the importer of the current module, as per [module.import]:

When a module-import-declaration imports a translation unit T, it also imports all translation units imported by exported module-import-declarations in T; such translation units are said to be exported by T.


In the case of an exported header unit, would the macros in the header unit affect any headers that are included in code that imports this module?

Edit 3: I am strongly convinced this answer is wrong, see further edit after the original answer

Assuming

import A;       // imports some macro FOO
                // (A exports a module-import-declaration that designates a
                // header-unit that defines the macro FOO)
import B;       // uses some header/header-unit that could be impacted by FOO
#include "C.h"  // has some declarations that could be impacted by FOO 

then B should not be impacted by A, but C.h should.

To justify that claim, I think there are two relevant quotes, one is how the import directive works [cpp.import]:

An import directive matching the first two forms of a pp-import instructs the preprocessor to import macros from the header unit ([module.import]) denoted by the header-name.

[...]

In all three forms of pp-import, the import and export (if it exists) preprocessing tokens are replaced by the import-keyword and export-keyword preprocessing tokens respectively. [Note 1: This makes the line no longer a directive so it is not removed at the end of phase 4. — end note]

the other would be what phase 4 of the translation process does [lex.phases]:

Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. [...] A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.

So before having to process the inclusion of C.h you should be in a state akin to

import-keyword A;
// preprocessor magic ensuring that macros imported from A are active
import-keyword B;
// preprocessor magic ensuring that macros imported from B are active
#include "C.h"

The inclusion of C.h should then be resolved "as per usual", impacted by the imports above it, whereas module B doesn't even know anything about its importer's preprocessor status.

Edit 3: my new answer and what I got wrong above

After some (re-)reading through the standard, I'm fairly certain the above interpretation is erroneous, despite matching the behaviour of some implementation I've tested.

Includes and macro expansions are all resolved during translation phases 1 through 4. Importing a macro must also be done during those phases. The only import directives that import a macro are "the first two forms of a pp-import", both of which denote a header-name. In other words, the only import directives that trigger a macro import are those that import a header-unit. The import directive that imports a module is "the third form of a pp-import", and that third form does not import macros.

So in the example above, neither B nor C.h should be impacted by A. Prior to handling the include of C.h, the translation unit should be in a state akin to

import-keyword A;
import-keyword B;
#include "C.h"

Specifically no macro is imported. The inclusion of C.h should then be resolved "as per usual", so without any influence from A/B in terms of macros.

If instead of importing a module A, we were importing a header-unit formed from some header A.h, then the import directive would match one of the "first two forms of a pp-import", so macros would be imported, and those macros would influence how the preprocessor handles the inclusion of C.h.

In both cases, module B doesn't know anything about its importer's preprocessor status.

One major source of confusion (for me) was this quote from [module.import]:

When a module-import-declaration imports a translation unit T, it also imports all translation units imported by exported module-import-declarations in T;

I initially interpreted this to mean that if you import a module, you recursively import the exported imports of header-units, leading to some "hidden" macro importing. What I failed to notice is that [module.import] explains the effect of module-import-declarations, which are introduced by the import-keyword, and that these module-import-declarations are not at all the same thing as an import-directive:

  • An import-directive is handled by the preprocessor, so during translation phases 1 through 4. An import-directive can change the preprocessor state, and that's why it is able to import macros. The import-directive is also the only way in which you can produce an import-keyword token (and hence obtain a module-import-declaration). An import-directive does not have any recursive behaviour.
  • A module-import-declaration is not handled by the preprocessor, it is a priori handled in translation phase 7, so well after the preprocessor has done its job. In particular, all macros and directives have already been handled/expanded. A module-import-declaration has some recursive behaviour as explained in [module.import] and quoted above.

So "importing" inside of a translation-unit is handled in two big steps. The import-directive handles macros in the case of header-units, and leaves behind an import-keyword in all cases. The import-keyword is like a marker so that later phases of translation will import other TUs, and be affected in terms of visibility/reachability.


Also, if a module just imports a header unit without exporting it, how does code that imports such a module get affected? Do the header units impact the code importing the module? If no, why does the first code snippet in my question throw so many linker errors saying ODR is violated in standard library components?

Well, you have pretty much already answered that question in your own answer. Anything you import (not just header-units but also other modules and other partitions within a module) will still, at a minimum, affect which declarations/definitions are visible, and if those are subject to the ODR, like class definitions, you can end up with invalid TUs. Header-units are more susceptible to that, in a way you made me discover: header guards/pragma once cannot be enforced, because imported modules were designed not to influence other imported modules, to be import-order independent, and to be processable in advance before being imported (in short, they were actually designed to be modular).

Edit 1: I feel like what you did shouldn't even trigger ODR-violations/redefinition errors, and that what I just wrote in the paragraph above shouldn't even matter/isn't how things should work.

Importing a header-unit isn't specified like an include directive. An include directive is specified like a "copy-paste". An import directive is specified to produce an import-keyword leading to "importing a TU", which affects what declarations are visible/reachable. So when importing you do not "copy-paste" anything, and you shouldn't be "redefining" anything, you should just gain access to more declarations.

In module-only code, conflicting declarations/redefinitions from different module-units can be checked, because each module-unit is clearly "identified/named": you can track the module-unit that introduced a certain declaration, and see if a different module-unit introduced a conflicting declaration. If the same declaration from the same module-unit becomes visible via multiple different "import paths", it doesn't really matter; you are guaranteed that it is the same declaration.

Because I regard the import of header-units as a compatibility feature, and because there are already some constraints on what kind of headers you can import as a header-unit ([module.import]: A header unit shall not contain a definition of a non-inline function or variable whose name has external linkage.), it does not sound too far-fetched to me that an implementation would attempt to track the filenames that introduced a declaration, and use that filename to disambiguate conflicting declarations. Arguably not all header-based libraries could work with that mechanism, but the set of importable headers is implementation-defined, so I assume each implementation would be allowed to impose constraints on what kind of header structure would be allowed.

I did some limited testing, and this seems to be how Visual Studio 17.3.6 deals with the issue. For example this will error:

// A.h
#ifndef A_H
#define A_H
struct Foo {};
#endif

// B.h
#ifndef B_H
#define B_H
struct Foo {};
#endif

// main.cpp
import "A.h";
import "B.h";
int main()
{
    Foo f;
}

But this will not:

// Foo.h
#ifndef FOO_H
#define FOO_H
struct Foo {};
#endif

// A.h
#ifndef A_H
#define A_H
#include "Foo.h"
#endif

// B.h
#ifndef B_H
#define B_H
#include "Foo.h"
#endif

// main.cpp
import "A.h";
import "B.h";
int main()
{
    Foo f;
}

If you've made it all the way down here, a little warning/disclaimer about the above. If I failed to make it obvious enough, this answer is based on my reading and interpretation of the C++20 standard, and I make no claim that I actually know how to read and interpret said standard correctly.

With that said, I wanted to go back to your very first question about how the global module fragment works. I like to think about the global module fragment as a form of (limited) "inline" header-unit that is imported but not exported. That is, if

  • you create a header specifically for the current module-unit,
  • put everything from the global module fragment inside that specific header,
  • import that header as a header-unit at the beginning of the current module-unit,

then I think you would mostly achieve the same effect as using the global module fragment:

  • declarations found in that fictitious header-unit would become visible/reachable in the module-unit
  • those declarations shouldn't become visible to importers of the module-unit
  • those declarations are in the purview of the global module
  • macros from the fictitious header-unit would become active in the module-unit
  • those macros shouldn't become active in the importers of the module-unit

Edit 2

I also have a new question related to importing header units which may result in multiple definition errors - how should I then go past this?

As I mentioned a little bit earlier, I feel like this is an implementation matter. All the same, implementations may not all behave the same way with regard to importing header-units, and that's an annoying constraint. I feel like your best shot at portable code is either:

  • not to import header-units and use the global module fragment, or
  • group up all headers that might trigger redefinition issues into one intermediate header, and import that
N.Bach
  • Thank you so much. I am going through this reply along with the references in the standard. Will get back to you with further questions if any. Thank you once again. – user17799869 Oct 17 '22 at 07:23
  • @user17799869 after rereading a couple of parts in my answer and the standard, I'm fairly certain I got something wrong with regard to how/when macros propagate. When I have time I'll throw in an edit (I'll leave all the content of the current answer accessible since you already accepted it) and ping you in comments when I do. – N.Bach Oct 18 '22 at 11:02
  • @user17799869 finally got some time to write it up, after literal days... – N.Bach Oct 23 '22 at 10:20

I am trying to understand what happens when we include header files in the global module fragment. What happens to code that imports such a module?

Your application can only have one definition of various libraries. If you import them in the module, you should not import them in main.

For example, in the import documentation they import iostream only once, in the helloworld module and not in main. As a rule, import once and include elsewhere as needed.

How would the above case be different from a module that exports imported header units like export import <iostream>?

You should not export that way; your export of Interface will export the imported features.

In the case of an exported header unit, would the macros in the header unit affect any headers that are included in code that imports this module?

You may need to use #include for some header units if you need header definitions.

Also, if a module just imports a header unit without exporting it, how does code that imports such a module get affected? Do the header units impact the code importing the module? If no, why does the first code snippet in my question throw so many linker errors saying ODR is violated in standard library components?

I believe this is answered by the previous answers in this post.

James Risner
  • This is not correct. The example in cppreference does not apply here because the code in main does not use any of the features from iostream. If, for example, you need to use std::cout in main.cpp, and you don't include or import it in main.cpp, then the compiler will complain. The other way around this would be to export import from the module interface, which would make the declarations from iostream available in main. – user17799869 Oct 07 '22 at 12:01
  • I am saying that your answer is wrong in the sense that you don't have a clear picture of their use, just like I don't. The example in cppreference is for a very simple use case. Real code is seldom like that. Take the example in cppreference itself. Now try adding a std::cout << "Hi" << std::endl; in their main.cpp file and see if your code compiles. My code is much more complicated and the main translation unit requires includes/imports of its own unless these imports are passed on transitively from the module from which they are being imported. – user17799869 Oct 07 '22 at 12:08
  • Hi, I request you to please delete your answer. It would in general prevent others from answering this question as well. I don't have the reputation to downvote this answer or do anything else except to ask you to delete it so that the question shows up as unanswered in the bounty section. – user17799869 Oct 10 '22 at 09:15