18

Given the code:

struct s1 {unsigned short x;};
struct s2 {unsigned short x;};
union s1s2 { struct s1 v1; struct s2 v2; };

static int read_s1x(struct s1 *p) { return p->x; }
static void write_s2x(struct s2 *p, int v) { p->x=v;}

int test(union s1s2 *p1, union s1s2 *p2, union s1s2 *p3)
{
  if (read_s1x(&p1->v1))
  {
    unsigned short temp;
    temp = p3->v1.x;
    p3->v2.x = temp;
    write_s2x(&p2->v2,1234);
    temp = p3->v2.x;
    p3->v1.x = temp;
  }
  return read_s1x(&p1->v1);
}
int test2(int x)
{
  union s1s2 q[2];
  q->v1.x = 4321;
  return test(q,q+x,q+x);
}
#include <stdio.h>
int main(void)
{
  printf("%d\n",test2(0));
}

There exists one union object in the entire program--q. Its active member is set to v1, and then to v2, and then to v1 again. Code only uses the address-of operator on q.v1, or the resulting pointer, when that member is active, and likewise q.v2. Since p1, p2, and p3 are all the same type, it should be perfectly legal to use p3->v1 to access p1->v1, and p3->v2 to access p2->v2.

I don't see anything that would justify a compiler failing to output 1234, but many compilers including clang and gcc generate code that outputs 4321. I think what's going on is that they decide that the operations on p3 won't actually change the contents of any bits in memory, they can just be ignored altogether, but I don't see anything in the Standard that would justify ignoring the fact that p3 is used to copy data from p1->v1 to p2->v2 and vice versa.

Is there anything in the Standard that would justify such behavior, or are compilers simply not following it?

supercat
  • 77,689
  • 9
  • 166
  • 211
  • If code was `unsigned x` instead of `unsigned short x`, do you see the same problem? – chux - Reinstate Monica Sep 13 '17 at 21:22
  • @chux: Yes. An earlier version of the code also tested copying the bytes of an object to two variables of type `unsigned char` and then writing them back (which compilers don't support either) and it was more convenient to do that with two bytes than four. The problem is that the compiler completely optimizes out the operations on `p3` and loses the aliasing-related information provided thereby. – supercat Sep 13 '17 at 21:31
  • I suspected `unsigned` would fail in a like-wise manner as `unsigned short`. With `unsigned`, we can set aside any of the _usual promotions_ issues - which _shouldn't_ affect this,. – chux - Reinstate Monica Sep 13 '17 at 21:35
  • @chux: While `unsigned short` could promote as either `int` or `unsigned`, coercion of values 32767u and below to `int` is fully defined by the Standard on all implementations. – supercat Sep 13 '17 at 21:56
  • An example of potential "strict aliasing" issue without a type pun! – curiousguy Sep 20 '17 at 03:23
  • 1
    @curiousguy: I just posted another nasty one which involves reordering of memory *writes*, rather than ordering of reads vs writes. The latter one I found particularly curious because it doesn't involve a compiler optimizing out a read and write that should have forced the order of some other reads and writes, but was adjusted so that what should be a conditional write gets turned into an unconditional write with a conditionally-chosen value. – supercat Oct 05 '17 at 18:37
  • You might also be interested in compiling the program with `-fsanitize=undefined` and see what UBSan alerts to. You have to run your program with its test data because UBSan is a realtime checker. It does not produce false positives. – jww Oct 29 '17 at 08:47

4 Answers4

9

I believe that your code is conformant, and there is a flaw with the -fstrict-aliasing mode of GCC and Clang.

I cannot find the right part of the C standard, but the same problem happens when compiling your code in C++ mode for me, and I did find the relevant passages of the C++ Standard.

In the C++ standard, [class.union]/5 defines what happens when operator = is used on a union access expression. The C++ Standard states that when a union is involved in the member access expression of the built-in operator =, the active member of the union is changed to the member involved in the expression (if the type has a trivial constructor, but because this is C code, it does have a trivial constructor).

Note that write_s2x cannot change the active member of the union, because a union is not involved in the assignment expression. Your code does not assume that this happens, so it's OK.

Even if I use placement new to explicitly change which union member is active, which ought to be a hint to the compiler that the active member changed, GCC still generates code that outputs 4321.

This looks like a bug with GCC and Clang assuming that the switching of active union member cannot happen here, because they fail to recognize the possibility of p1, p2 and p3 all pointing to the same object.

GCC and Clang (and pretty much every other compiler) support an extension to C/C++ where you can read an inactive member of a union (getting whatever potentially garbage value as a result), but only if you do this access in a member access expression involving the union. If v1 were not the active member, read_s1x would not be defined behavior under this implementation-specific rule, because the union is not within the member access expression. But because v1 is the active member, that shouldn't matter.

This is a complicated case, and I hope that my analysis is correct, as someone who isn't a compiler maintainer or a member of one of the committees.

Myria
  • 3,372
  • 1
  • 24
  • 42
  • I think the fundamental problem which seems to hit a lot of compilers is that their intermediate code lacks instructions that would force a compiler to act as though it might access to any arbitrary object of type T, without the code actually performing such an access. Having a compiler generate machine code which actually read and wrote via `p3` would be silly, but I don't think the compilers' intermediate code has any other way of expressing the Standard-mandated semantics. – supercat Sep 13 '17 at 21:09
  • If code does something with p3 that would require a compiler to actually perform the accesses, the compilers will have no problems recognizing the aliasing. The problem arises if the compiler decides to optimize code which shouldn't generate any machine instructions but would affect objects' effective types or active members. Similar problems occur when using `char`-type accesses. Proponents of strict aliasing claim the solution to aliasing issues is to use character-pointers or unions, but that won't help if compilers optimize out such things. – supercat Sep 13 '17 at 21:19
  • @supercat you can find many of those optimisation traps examples in the net. – 0___________ Sep 13 '17 at 21:56
  • 2
    @PeterJ_01: From what I've seen, most such examples involve code which might, using a sufficiently-twisted interpretation of the Standard, be viewed as invoking UB. My goal here is to have an example whose behavior is clearly, unambiguously, and undeniably defined by the Standard. – supercat Sep 13 '17 at 22:02
  • @supercat you have decided to allow optimisations, you should understand the effects, or compile with the optimistions disabled – 0___________ Sep 13 '17 at 22:13
  • @PeterJ_01: The authors of gcc claim that any code which fails because of their optimizations is "broken", and cite the Standard as justification for that view. – supercat Sep 13 '17 at 22:46
  • @supercat if you think it the optimiser bug just report it. You can also stop using gcc & clang – 0___________ Sep 13 '17 at 22:56
  • 2
    @PeterJ_01: It's not just gcc and clang. Many compilers on godbolt behave the same way, which would suggest that the behavior is by design, which makes me wonder if compiler writers are interpreting the Standard in such a way as to justify their behavior. – supercat Sep 13 '17 at 23:27
  • @supercat Have you filed a GCC and Clang bug for this? If you don't want to, I could do that. – Myria Sep 14 '17 at 22:01
  • @Myria: Please do. The problem goes far beyond those. Of all the compilers I tested on godbolt with type-based aliasing enabled (for some I couldn't figure out how to enable it), only icc worked correctly. – supercat Sep 14 '17 at 22:07
  • @Myria I uv'd as your argumentation makes sense, but yet I have doubts when not referring to the **C** standard. See my answer for an attempted interpretation that's consistent with observed compiler behavior (up for discussion) :) –  Sep 15 '17 at 14:18
  • @supercat I filed bugs with GCC and Clang about this, using your code. – Myria Sep 15 '17 at 22:01
  • 1
    @Myria: Incidentally, while the particular example was contrived to show the problem as simply as possible, the problem could plausibly occur in real-world code. For example, if code examines an array element, uses a loop to convert everything in the array to another type with the same representation, acts upon some things as that new type, and then uses another loop to convert everything back, the loops that do the conversion may get optimized out. I think the problem is that the Standard describes Effective Types in terms of objects rather than lvalues, but there's no way... – supercat Sep 17 '17 at 21:17
  • ...that gcc or clang can represent the notion of "This code changes the Effective Type of an X to Y, and that X might, but need not be, the same X identified by some other lvalue" except by physically reading the object as type X and writing it as type Y. – supercat Sep 17 '17 at 21:23
  • @Myria: Have you heard anything in reply to either bug report? – supercat Sep 30 '17 at 17:44
  • @supercat https://bugs.llvm.org/show_bug.cgi?id=34632 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82224 – Myria Oct 04 '17 at 02:22
  • @Myria: I'm puzzled by why the people on those lists find certain things no difficult. With regard to the Common Initial Sequence rule, for example, say that any access to a struct which is part of a complete union type declaration *which is visible at the point of access* is presumed capable of accessing CIS members of other types in that union. I'm not sure why people think that would require knowing every type that anyone might declare anyplace. – supercat Oct 04 '17 at 05:18
  • @Myria: As for the specific issue, I think the best remedy would be to have the Standard recognize distinct aliasing modes, one of which would make Effective Types permanent and don't allow union members to have their address taken (specify `[]` as operators that can act on arrays without them decomposing into pointer types), one of which would require that all accesses be treated as `volatile`, and some of which would fall between those extremes. C is used for so many purposes that any single set of rules that tries to serve them all will, at best, serve them all badly. – supercat Oct 04 '17 at 05:23
  • @Myria: I'd enjoy sharing what suspect would, if recognized, become the most popular aliasing mode--one that would be easier to describe unambiguously than the present rules, support most code that would presently require `-fno-strict-aliasing`, and yet still allow the majority of the optimizations that are now possible and many others that *aren't*, – supercat Oct 04 '17 at 05:37
  • @Myria: I just posted another question which I think may represent a different bug in gcc (one which clang doesn't share). – supercat Oct 05 '17 at 18:38
3

With a strict interpretation of the standard, this code might be not conforming. Let's focus on the text of the well-known §6.5p7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

(emphasis mine)

Your functions read_s1x() and write_s2x() do the opposite of what I marked bold above in the context of your whole code. With just this paragraph, you could conclude that it's not allowed: A pointer to union s1s2 would be allowed to alias a pointer to struct s1, but not vice versa.

This interpretation of course would mean that the code must work as intended if you "inline" these functions manually in your test(). This is indeed the case here with gcc 6.2 for i686-w64-mingw32.


Adding two arguments in favor of the strict interpretation presented above:

  • While it's always allowed to alias any pointer with char *, a character array can't be aliased by any other type.

  • Considering the (here unrelated) §6.5.2.3p6:

    One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible.

    (again emphasis mine) -- the typical interpretation is that being visible means directly in the scope of the function in question, not "somewhere in the translation unit" ... so this guarantee doesn't include a function that takes a pointer to one of the structs that's a member of the union.

  • 1
    Applying the `&` operator to a structure or union member yields a pointer of that member type. Further, if a structure or union member is an array, it would be impossible to do anything with that member *except* by using a pointer of its constituent type. While I suppose one could read the Standard in such fashion as to say that applying the address-of operator to a struct or union member will yield a pointer that can't actually be used for any purpose unless it is first converted to a character type, it would seem more reasonable to allow implementations to treat the result of... – supercat Sep 15 '17 at 15:03
  • ...the `&` operator as having a type which was incompatible with any pointer type other than `void*` unless it was applied to a member of character type (in which case it would naturally yield `char*`). As it happens, though, while I haven't tested all of them on all compilers, other means of controlling Effective Types (such as reading all of the individual bytes of an object to discrete objects of type `unsigned char` and then writing all of the individual bytes of the object) also fail on gcc and clang. I think the problem is that compilers lack any concept of code that might... – supercat Sep 15 '17 at 15:06
  • ...change the Effective Type of an object, or the active member of a union, but which doesn't require generating any actual machine-code loads or stores. – supercat Sep 15 '17 at 15:06
0

I didn't read the standard, but playing with pointers in a strict-aliasing mode (ie, using -fstrict-alising) is dangerous. See the gcc online doc:

Pay special attention to code like this:

union a_union {
  int i;
  double d;
};

int f() {
  union a_union t;
  t.d = 3.0;
  return t.i;
}

The practice of reading from a different union member than the one most recently written to (called type-punning) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected. See Structures unions enumerations and bit-fields implementation. However, this code might not:

int f() {
   union a_union t;
   int* ip;
   t.d = 3.0;
   ip = &t.i;
   return *ip;
}

Similarly, access by taking the address, casting the resulting pointer and dereferencing the result has undefined behavior, even if the cast uses a union type, e.g.:

int f() {
  double d = 3.0;
  return ((union a_union *) &d)->i;
}

The -fstrict-aliasing option is enabled at levels -O2, -O3, -Os.

Found anything similar in the second example huh?

walkerlala
  • 1,599
  • 1
  • 19
  • 32
  • 1
    Note that my example, the only takes the address of a union member after having written *that member*, and abandons that pointer before writing another member. The problem is that gcc and clang both try to apply two conflicting optimizations: omitting the code which would read one union member and then write the exact same bit pattern to another would be a fine optimization in isolation, but trips up later optimizations. What's tragic is that the authors of the C Standard didn't better state the notion that compilers may generally assume there's no aliasing *when there's no evidence of it*... – supercat Oct 27 '17 at 14:40
  • ...but that quality compilers should not be oblivious to evidence of aliasing in useful cases. While the Standard doesn't explicitly say that taking the address of a union member ("active" or not) should be recognized as evidence of aliasing, the most likely reason is that they thought it was *obvious*. If the compiler recognized the act of taking union members' addresses as evidence of aliasing, the omission of the read-write sequence wouldn't matter. – supercat Oct 27 '17 at 14:42
  • @supercat you are taking the address of a member of a union (i.e., address of a "struct s2" out of a "union s1s2"), and that is illegal according to the example provided above. – walkerlala Oct 28 '17 at 15:40
  • @supercat I read almost all comments you provided in other answer but still I am not clear what you are arguing for. What is your point? Do you think that both gcc and clang are doing the wrong thing? Or you are just blaming the C standard? – walkerlala Oct 28 '17 at 15:43
  • The behavior of clang and gcc is unambiguously non-conforming here. Further, the published Rationale for the Standard explicitly states that rather than trying to mandate everything needed to make an implementation useful, it expects that if implementations do what's required by the Standard they will naturally do the *other* things necessary to make them useful. For an implementation to efficiently uphold the Standard, it must have a way of accommodating actions that might change the active type of a union or the Effective Type of storage, without knowing whether the action... – supercat Oct 28 '17 at 16:27
  • ...would actually change it or not. An implementation with that ability could most easily uphold the Standard by simply treating address-of-union-member as having such semantics. I don't think the authors of the Standard considered the possibility that implementations might seek to avoid doing so and instead use more complicated and less useful ways of meeting the Standard's requirements. – supercat Oct 28 '17 at 16:42
-2

It is not about conforming or not conforming - it one of the optimisation "traps". All of your data structures have been optimised out and you pass the same pointer to optimised out data so the the execution tree is reduced to simple printf of the value.

  sub rsp, 8
  mov esi, 4321
  mov edi, OFFSET FLAT:.LC0
  xor eax, eax
  call printf
  xor eax, eax
  add rsp, 8
  ret

to change it you need to make this "transfer" function to be side effect prone and force the real assignments. It will force optimizer to not reduce those nodes in the execution tree:

int test(union s1s2 *p1, union s1s2 *p2, volatile union s1s2 *p3)
/* ....*/

main:
  sub rsp, 8
  mov esi, 1234
  mov edi, OFFSET FLAT:.LC0
  xor eax, eax
  call printf
  xor eax, eax
  add rsp, 8
  ret

it is quite trivial test just artificially made a bit more complicated.

0___________
  • 60,014
  • 4
  • 34
  • 74
  • 1
    Certainly making the code more complicated would result in compilers processing it correctly. I see nothing in the Standard that would allow conforming compilers to require such useless code, however, and question the sanity of an optimizer requiring programmers to add code they know to be useless. – supercat Sep 13 '17 at 21:59
  • 1
    @supercat using this login any optimisation which changes or reduces the code (for example not calling the function) is not conforming. This is the trap - all local variables, no use of any of them etc. Writing the code this way and allowing the optimisation, programmer should take into considerations those effects. – 0___________ Sep 13 '17 at 22:06
  • Change your code to: `int test(union s1s2 *p1, union s1s2 *p2, union s1s2 *p3) { if (read_s1x(&p1->v1)) { unsigned short temp; temp = p3->v1.x; p3->v2.x = temp; write_s2x(&p2->v2,1234); temp = p3->v2.x; p3->v1.x = temp; printf("%d\n",p3->v2.x); } return read_s1x(&p1->v1); }` – 0___________ Sep 13 '17 at 22:08
  • The program I posted was deliberately contrived to "bait" compilers into making illegitimate optimizations, but that doesn't make their optimizations conforming. If compiler writers wanted to specify that certain optimizations make code non-conforming, and accept that incompatibility with such optimizations doesn't mean code is "broken", but merely means that the code and optimizer aren't suited for each other, that would be fine, though a better compiler should try to maximize compatibility with existing code. – supercat Sep 13 '17 at 23:21
  • @supercat I know that is the intentional abuse :). No one sane would not write something like this. What is the actual reason of the post? Bug report? Something else. The logical answer is: ant code optimisations may have possible some side effects and the programmer should have it in mind. – 0___________ Sep 13 '17 at 23:31
  • 1
    The question is whether there is anything in the Standard that would justify the behavior which seems to be widespread, or whether the parts of the Standard relating to aliasing should be viewed as meaningless since nobody follows them anyway? – supercat Sep 13 '17 at 23:58