
I'm trying to debug a seg fault that's occurring in my team's codebase. A struct named hash_key is defined in two files, file1.cpp and file2.cpp. Neither definition is wrapped in an anonymous namespace, so each seems to just be a global definition within its file, e.g.,

In file1.cpp, we define

struct hash_key {
  int a;
  int b;
};

// programs below use hash_key 

In file2.cpp, we define

struct hash_key {
  string s;
};

// programs below use hash_key 

(Note that the hash_key definitions in the two files have no overlapping member variable names.)

I don't think anything in file1.cpp should see the contents of file2.cpp and vice versa. However, when I wrap an anonymous namespace around the hash_key in file2.cpp, the seg fault goes away, so now I'm questioning whether my understanding of multi-file C++ compilation is correct.
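
For reference, a minimal sketch of the change described above, assuming file2.cpp pulls in `std::string`. The helper `file2_key_length` is hypothetical and only stands in for the code that uses hash_key; it is not from the real codebase.

```cpp
// file2.cpp (sketch)
#include <cstddef>
#include <string>

namespace {

// Inside an unnamed namespace, hash_key has internal linkage: it is local
// to this translation unit and cannot clash with file1.cpp's hash_key.
struct hash_key {
  std::string s;
};

}  // namespace

// Hypothetical helper standing in for "programs below use hash_key".
std::size_t file2_key_length(const std::string& text) {
  hash_key key{text};  // aggregate initialization of the local type
  return key.s.size();
}
```

Wrapping the definition in file1.cpp the same way would make both types local to their own files.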

When code in file1.cpp uses hash_key, it should be using the hash_key struct defined within that file, right?

It could be something else that's causing the seg fault, but right now, I'm empirically observing that having the namespace in file2.cpp removes the seg fault issue.

roulette01
  • Try using an _anonymous namespace_ if that struct is only used in the `.cpp` files. The linker might get confused otherwise. – πάντα ῥεῖ Mar 22 '22 at 13:45
  • You have an ODR violation and therefore undefined behaviour; type names need to be globally unique – Alan Birtles Mar 22 '22 at 13:46
  • Are the `hash_key` structs identical? – lior.i Mar 22 '22 at 13:48
  • Hmmm I see, but the member variables in `hash_key` are different in each file (there are no overlapping variable names). So if this was indeed an ODR issue, wouldn't that cause issues during compilation? – roulette01 Mar 22 '22 at 13:48
  • @lior.i No, they're actually defined differently, which makes me confused. Let me include that in the OP – roulette01 Mar 22 '22 at 13:49
  • `file1.cpp` doesn't see contents of `file2.cpp`, but linker sees both files and is like "hmm, these two structures have the same name, sure they have the same definition as well, I can only leave one of them". – Yksisarvinen Mar 22 '22 at 13:49
  • "wouldn't that cause issues during compilation?", never during compilation, as each translation unit is fully independent of each other. But most linkers resolve with "first one wins" :-) – Klaus Mar 22 '22 at 13:49
  • @roulette01 So my guess is that one of them is accessing the wrong struct and messing with the wrong addresses – lior.i Mar 22 '22 at 13:50
  • When you violate [ODR](https://en.cppreference.com/w/cpp/language/definition), any behavior is possible. ODR violations do not require diagnostics from the compiler (giving a compiler error). – NathanOliver Mar 22 '22 at 13:51
  • If the source-file tokens are exactly the same for both structures, then it's fine. If even one token mismatches, then you have undefined behavior. For example, if you have `struct hash_key { int x; };` in one file and `struct hash_key { int y; };` in another, then the structures are not the same and you have UB. This is why defining structures in header files works: all translation units that include the header file will have the same token-by-token exact copy of the same structure. – Some programmer dude Mar 22 '22 at 13:51
  • If a change that shouldn't cause different behaviour actually does, something somewhere in your code has undefined behaviour. – molbdnilo Mar 22 '22 at 13:51
  • In general I wrap anything declared in a cpp file in an anonymous namespace; it prevents later surprises when something else with the same name is declared in another cpp file – Alan Birtles Mar 22 '22 at 13:54
  • @roulette01 Can you run a debugging tool and say where it crashes? That way you can find out what happens and which code accesses the wrong struct – lior.i Mar 22 '22 at 13:54
  • If those structs are only inside a cpp file (not in a header), then just use an anonymous namespace and the problem is solved. The old-fashioned way to fix it is to use the `static` keyword (in this context it limits a symbol to one translation unit). – Marek R Mar 22 '22 at 13:58
  • I just want to make sure I understand this correctly. The linker will resolve the ODR violation by only including one of the definitions of `hash_key`, correct? If so, and let's assume it used the one defined in `file1`, what would happen if `file2`'s code tried to access `hash_key::s`? UB? – roulette01 Mar 22 '22 at 14:00
  • Yep, as soon as you have an ODR violation you have UB: your code could appear to work, or it might crash before either structure has even been used – Alan Birtles Mar 22 '22 at 14:01
  • @roulette01 -- no, the linker doesn't make decisions about type definitions. It merges code from multiple sources and primarily deals with functions. When two functions have the same name and the same argument list but different implementations, the result will typically be that the linker picks one and code that expects the other won't work right. Types (e.g., `struct hash_key`) don't generate code; the compiler uses them to decide what code to generate, and by the time the linker gets involved the code has already been generated. Nevertheless, this ODR violation results in undefined behavior. – Pete Becker Mar 22 '22 at 14:07
  • Move the `hash_key` definition to a common header file, so both translation units see the same definition. Or wrap each `hash_key` definition in the .cpp files with a `namespace { ... }` anonymous namespace so each translation unit has its own global-within-the-translation-unit understanding of `hash_key`. – Eljay Mar 22 '22 at 14:14
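
The comments about generated code can be made concrete with a rough single-file sketch. The namespaces `file1_view` and `file2_view` are hypothetical and exist only so that both definitions can sit in one file; in the real code both types share the plain global name `hash_key`. The sketch only shows that the compiler generates different code for each definition (one needs a destructor call, the other does not). It does not reproduce the crash itself.

```cpp
#include <string>
#include <type_traits>

// Stand-in for what file1.cpp sees (hypothetical namespace).
namespace file1_view {
struct hash_key {
  int a;
  int b;
};
}  // namespace file1_view

// Stand-in for what file2.cpp sees (hypothetical namespace).
namespace file2_view {
struct hash_key {
  std::string s;
};
}  // namespace file2_view

// Two ints need no cleanup; a std::string member must be destroyed.
constexpr bool file1_trivial =
    std::is_trivially_destructible<file1_view::hash_key>::value;
constexpr bool file2_trivial =
    std::is_trivially_destructible<file2_view::hash_key>::value;

static_assert(file1_trivial, "file1's hash_key needs no destructor call");
static_assert(!file2_trivial, "file2's hash_key must run ~std::string");
```

Without the namespaces, inline functions and template instantiations that mention the type (for example a `std::vector<hash_key>`, assuming the code uses one) get the same symbol name in both translation units. If the linker keeps only one copy, one file may end up running code compiled for the other layout, which is one plausible route to a seg fault.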
