1

0.c

#include <stdio.h>

struct test{
 int a;
};

struct test get(int in);

int main(){
 struct test t = get(1234);
 printf("%d\n",t.a);

 return 0;
}

1.c

struct test{
 char a;    // Note this is char instead of int.
};

struct test get(int in){
  struct test t = {in};
  return t;
}

struct test has two different definitions: one whose member is an int and one whose member is a char.

Is this undefined behaviour? C doesn't officially have a one definition rule like C++, and this post says multiple definitions are OK: Are different translation units allowed to define structures with the same name?

Dan
  • 1
    It's most definitely UB. Maybe a strict aliasing violation? – ikegami Oct 18 '21 at 18:04
  • Intuitively I would guess that this is fine if the struct has internal linkage, but could be problematic with external linkage because the linker may generate the same symbol name between the two translation units. – 0x5453 Oct 18 '21 at 18:04
  • Is there something unclear about the linked question? The accepted answer there seems pretty straightforward to me, and the situation posed in the question is nearly identical to this one. – 0x5453 Oct 18 '21 at 18:09
  • The linked question says this is ok, this doesn't look ok at all. – Dan Oct 18 '21 at 18:11
  • 1
    The linked question says this is ok, this doesn't look ok at all. More specifically, I'm using the result of one TU in another. – Dan Oct 18 '21 at 18:17
  • 2
    Using the result in another TU and having two separate definitions are two completely different things. If a definition of a return type is local to the TU the function is defined in, it cannot be used outside that TU, unless the same type is defined outside and is fully compatible. – Eugene Sh. Oct 18 '21 at 18:21
  • Right, so you can have different definitions as long as you use them only within that TU, else it's UB as expected? It's not the ODR, so what rule would it be? – Dan Oct 18 '21 at 18:31
  • @0x5453, in C, struct definitions have *no* linkage. Only function and object (C sense) identifiers can have external or internal linkage in C. – John Bollinger Oct 18 '21 at 18:46

3 Answers

5

Is this Undefined Behaviour?

Yes.

and this post says multiple definitions are ok?

Sure, but not when the type is used across both TUs, as it is here.

The problem is not even at struct test t = get(1234);, but earlier, at the function declaration struct test get(int in);. Here's the rule, from 6.2.7p2:

All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

From 6.7.6.3p15:

For two function types to be compatible, both shall specify compatible return types [...]

And from 6.2.7p1:

[...] two structure [...] declared in separate translation units are compatible if their tags and members satisfy the following requirements: [...] If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types.

File 0.c declares function get with struct test get(int in); and file 1.c declares function get with struct test get(int in){...} (a definition is also a declaration). Both have the same name, so they refer to the same function.

Both have the return type struct test. In 0.c struct test has a member of type int, and in 1.c struct test has a member of type char. Types of the first pair of members of both structures in both files are not compatible, so types struct test are not compatible, so function declarations are not compatible, so the behavior of code is undefined.
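As a rough illustration of the usual fix, here is a minimal single-file sketch in which both sides see one definition of struct test. In a real project that definition would typically live in a shared header included by 0.c and 1.c. The header name test.h and the helper print_value are hypothetical, not from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Would live in a shared header (hypothetical name: test.h)
   that both 0.c and 1.c include, so the types are compatible. */
struct test {
    int a;
};

/* The declaration that 0.c sees. */
struct test get(int in);

/* The definition from 1.c, now using the same struct test. */
struct test get(int in) {
    struct test t = {in};
    return t;
}

/* Hypothetical helper doing what main() in 0.c does:
   call get(), print the member, and return it. */
int print_value(int in) {
    struct test t = get(in);
    printf("%d\n", t.a);
    assert(t.a == in); /* the value survives the round trip */
    return t.a;
}
```

With a single shared definition, every declaration of get has the same return type, so the chain of incompatibilities described above never starts.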

KamilCuk
  • I see, so if each TU uses the structs in its own file then it should be fine? To confirm, this still will not be OK in C++? – Dan Oct 18 '21 at 18:41
  • 1
    `if each TU uses the structs in its own file then it should be fine?` Well, these are vague terms, like "on its own" or "uses". You don't have to call function `get` in `main()`; the declaration alone makes the behavior undefined. It should be fine for identifiers with no linkage (things defined inside functions, like `func() { int here; }`) and for identifiers with internal linkage (defined with `static`). `will not be ok in C++?` I do not know what "this" refers to, but for sure in C++, as in C, the return types of a function's declarations have to agree. – KamilCuk Oct 18 '21 at 18:45
2

Is this Undefined Behaviour?

Yes.

C doesn't officially have a one definition rule like C++, and this post says multiple definitions are OK: Are different translation units allowed to define structures with the same name?

C does not have exactly the same one-definition rule that C++ has, but it does have similar rules. There can be at most one external definition of each identifier with external linkage anywhere in the program. There can be at most one definition of an identifier with internal linkage in any given translation unit. There can be at most one definition of an identifier with no linkage in the same name space within the scope of that identifier, except that typedefs may be redefined identically (since C11).
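A minimal sketch of those three rules within one translation unit may help. All names here (counter, bump, item_id, bump_twice) are hypothetical, and the repeated typedef assumes a C11 or later compiler.

```c
#include <assert.h>

/* External linkage: at most one definition in the whole program.
   Other translation units may only declare it: extern int counter; */
int counter = 0;

/* Internal linkage: at most one definition per translation unit.
   Another TU could define its own, unrelated static bump(). */
static int bump(void) {
    return ++counter;
}

/* No linkage: normally one definition per scope, but a typedef may be
   repeated as long as it names the same type (C11 and later). */
typedef int item_id;
typedef int item_id;

/* Uses all three identifiers; returns the counter after two bumps. */
int bump_twice(void) {
    item_id first = bump();
    item_id second = bump();
    assert(second == first + 1);
    return second;
}
```

Structure tags fall under the third rule: they have no linkage, which is why the same tag can be defined differently in another translation unit.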

The answer you referred to explains that because structure type declarations have no linkage, you can re-use structure tags in different translation units, but that is not sufficient for the needs of the code presented in this question.

Here, in order for the behavior of the program to be well defined, all declarations of function get() anywhere within the program must be compatible with each other (C17, 6.2.7/2). Declarations include definitions, so the type specified for get() by its prototype in 0.c must be compatible with the type specified by its definition in 1.c. "Compatible" is a defined term in C, covered by section 6.2.7 of the language specification and by several other sections that it references.

The rules for function declarations are in paragraph 6.7.6.3/15. The most relevant provision is the very first sentence:

For two function types to be compatible, both shall specify compatible return types.

We go back to 6.2.7/1 for the definition of structure type compatibility:

two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; [...] and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order.

Your two struct test types are (perforce) defined with the same tag, and their members have the same names in the same order, but their corresponding members do not have compatible types (details left as an exercise). As a result, the declaration of function get() in 0.c is not compatible with the definition in 1.c, and therefore the program has undefined behavior -- and still would even if get() were never called.
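For the member-type step, one way to see that int and char are not compatible is a C11 generic selection, which picks the association whose type is compatible with the type of its controlling expression. This is only a sketch: the two structures get different tags (test_int, test_char) so they can coexist in a single file, and the macro and function names are hypothetical.

```c
#include <assert.h>

struct test_int  { int  a; };  /* stands in for struct test in 0.c */
struct test_char { char a; };  /* stands in for struct test in 1.c */

/* 1 if the type of expr is compatible with int, otherwise 0.
   Requires C11 for _Generic. */
#define IS_COMPATIBLE_WITH_INT(expr) _Generic((expr), int: 1, default: 0)

int member_in_0c_is_int(void) {
    struct test_int t = {0};
    return IS_COMPATIBLE_WITH_INT(t.a);
}

int member_in_1c_is_int(void) {
    struct test_char t = {0};
    return IS_COMPATIBLE_WITH_INT(t.a);
}

/* Returns 1 when the two member types disagree, as they do here. */
int members_differ(void) {
    assert(member_in_0c_is_int());
    assert(!member_in_1c_is_int());
    return member_in_0c_is_int() != member_in_1c_is_int();
}
```

Since the corresponding members fail the compatibility test, the two structure types are not compatible either, and neither are the two function types that return them.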

John Bollinger
  • Thanks for the information. I'd like to ask the same question as under the other answer: if each TU uses the structs in its own file then it should be fine in C, but to confirm, this still will not be OK in C++ due to its ODR? – Dan Oct 18 '21 at 18:50
  • @Dan, it does not produce UB in C for incompatible structure types with the same tag to be defined in different translation units. Each translation unit may declare and access objects of whichever one is defined in that TU without for that reason producing UB. The same is not true in C++, where multiple definitions of a class type (which includes structure types) produce UB if they do not all consist of the same sequence of tokens. – John Bollinger Oct 18 '21 at 18:59
  • Thanks for clarifying. Can I also ask whether the rule you mention covers this as well: `file 0.c`: `int get(int in);` and then in `file 1.c`: `char get(char in){ /* definition */}`? They have different parameter types and return types. Would this be UB for the same reason you mentioned? – Dan Oct 18 '21 at 19:26
0

The problem with such code in C is that the compiler records only the external names of functions, with no information about their parameters or return types. So neither translation unit sees the declarations in the other; each provides only the external name get. The linker therefore treats the declaration in one and the definition in the other as the same function get. But the structure types used in the declaration and in the definition are different and not compatible, so the program has undefined behavior.

Vlad from Moscow