23

This appears to be undefined behavior:

union A {
  int const x;
  float y;
};

A a = { 0 };
a.y = 1;

The spec says

Creating a new object at the storage location that a const object with static, thread, or automatic storage duration occupies or, at the storage location that such a const object used to occupy before its lifetime ended results in undefined behavior.

But no compiler warns me, even though this is an easy mistake to diagnose. Am I misinterpreting the wording?

timrau
Johannes Schaub - litb
  • You don't always, or even usually, get warnings for undefined behaviour. Voting to close as you answered your own question... – BlueRaja - Danny Pflughoeft Apr 13 '11 at 17:58
  • @BlueRaja litb isn't asking "why isn't the compiler warning me", he's asking "the compiler didn't warn me; is that because I misinterpreted the spec?" – JaredPar Apr 13 '11 at 18:00
  • @Blue there is no reason why a compiler wouldn't warn or error out for an easy to diagnose mistake, simply by looking for const union members in a union with non-const members. Every compiler I have access to warns for `void f() { int a; ++a = ++a; }`. Also, what @JaredPar says applies :) – Johannes Schaub - litb Apr 13 '11 at 18:01
  • Where in the spec is that from? Sometimes reading the surrounding context helps. – Martin York Apr 13 '11 at 18:04
  • It's at 3.8[basic.life]/9 in both C++03 and the C++0x FDIS (N3290). – James McNellis Apr 13 '11 at 18:07

3 Answers

7

The latest C++0x draft standard is explicit about this:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time.

So your statement

a.y = 1;

is fine, because it changes the active member from x to y. If you subsequently referenced a.x as an rvalue, the behaviour would be undefined:

cout << a.x << endl ; // Undefined!

Your quote from the spec is not relevant here, because you are not creating any new object.
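
A minimal sketch of the active-member rule, assuming a union with non-const members so that the const question does not arise. The names `Payload`, `Tagged`, `set_y` and `get_y_or` are hypothetical and only serve the illustration. Because the union itself keeps no record of which member is active, the sketch tracks it by hand:

```cpp
#include <cassert>

// Hypothetical example types, not taken from the standard or the question.
union Payload {
  int x;
  float y;
};

enum class Active { X, Y };

// The union stores no tag of its own, so the active member is tracked here.
struct Tagged {
  Active active;
  Payload value;
};

// Make y the active member and record that fact in the tag.
inline void set_y(Tagged& t, float v) {
  t.value.y = v;
  t.active = Active::Y;
}

// Read y only when it is the active member; otherwise return the fallback
// instead of touching an inactive member.
inline float get_y_or(const Tagged& t, float fallback) {
  return t.active == Active::Y ? t.value.y : fallback;
}
```

With `Tagged t{Active::X, {0}};`, calling `get_y_or(t, -1.0f)` returns the fallback until `set_y(t, 1.0f)` has switched the active member.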

Emil Laine
TonyK
  • But it destroys `a.x` and creates `a.y`, therefore creating a new object at the storage location that a const object of automatic storage duration previously occupied (see 3.8p1, we reuse the storage of `a.x` for `a.y`). Can you please explain why it doesn't apply in more detail? – Johannes Schaub - litb Apr 13 '11 at 18:31
  • @Johannes: does `a.y = 1` end the lifetime of `a.x` by reusing the memory? If so, then the part of the spec you quote isn't violated. But I don't know one way or the other whether it does, it's a pretty fine distinction as to what it means to re-use memory. As TonyK says, the text about unions invents its own terminology "is active" in preference to talking in terms of object lifetime. – Steve Jessop Apr 13 '11 at 18:34
  • @Steve if it doesn't end the lifetime of `a.y`, then we would violate aliasing rules, because we would modify the stored value of an `int const` object by an lvalue of type `float` in this example: `union A { int x; float y; }; A a = { 0 }; a.y = 1;`. – Johannes Schaub - litb Apr 13 '11 at 18:41
  • @Johannes: Assigning to `a.y` doesn't destroy `a.x` (which has no destructor anyway) -- it just makes `a.x` invalid. It doesn't create anything either, it just makes `a.y` valid. No record is kept of which member of a union is valid at any time, so it's up to the programmer to follow the rules. – TonyK Apr 13 '11 at 18:42
  • +1 for the correct answer - OP's snippet is valid, the const member just can't be reactivated. – Erik Apr 13 '11 at 18:49
  • @Johannes: sorry, I don't know what you mean. End the lifetime of `y`? I said end the lifetime of `x`. If it does indeed end the lifetime of `x` then we are not creating a new object in the storage occupied by a const *before its lifetime ends*, and hence your code is not UB. – Steve Jessop Apr 13 '11 at 18:53
  • @Johannes: oh, or maybe I've misread. Does it mean to say that creating an object in memory that has *ever* been occupied by a const object is UB? Even if e.g. the const object was created with placement new and has been destructed? Ouch. – Steve Jessop Apr 13 '11 at 18:55
  • @Steve yes, sorry. I mean "end the lifetime of a.x". The text says also "or, at the storage location that such a const object used to occupy before its lifetime ended", which will then apply here. – Johannes Schaub - litb Apr 13 '11 at 18:56
  • @Johannes: that's brutal. So `struct Foo { const int a; }; char x[sizeof(Foo)]; Foo *p = new (x) Foo(); p->~Foo(); x[0] = 0;` is undefined behavior? `x[0]` is at the location of a const object before its lifetime ended. Or even if `x[0] = 0;` doesn't "create an object", a repeat of `new (x) Foo();` certainly does. – Steve Jessop Apr 13 '11 at 19:01
  • Oh no, hang on. `a` there has dynamic storage duration, so is exempt. Not so brutal :-) But `x` in your example has automatic duration, so isn't. – Steve Jessop Apr 13 '11 at 19:05
  • @TonyK GCC developers (some of them being committee members, like Mike Stump) interpret the rule like me: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286 – Johannes Schaub - litb Apr 13 '11 at 19:11
  • @litb: Another committee member (James Dennett) seems to agree with TonyK's interpretation - see http://www.mail-archive.com/gcc@gcc.gnu.org/msg39958.html – Erik Apr 13 '11 at 19:29
  • @Johannes: Help me out here. You provide a link to a 3000-line bug report and expect me to find the relevant sentence? Even searching for `Mike Stump` left me with hundreds of lines. – TonyK Apr 13 '11 at 19:39
  • @Erik I cannot see him contradicting what I said. Yes, it changes the active member from `a.x` to `a.y`. But that does not mean that it doesn't "create a new object at the storage location that a const object ... used to occupy". James doesn't contradict this view, as far as I can see. – Johannes Schaub - litb Apr 13 '11 at 19:43
  • @TonyK I'm sorry. Here is a direct link to a relevant comment: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286#c16 – Johannes Schaub - litb Apr 13 '11 at 19:45
  • @litb: I read it as one const member is ok, two or more aren't. I'll ask him :P – Erik Apr 13 '11 at 19:47
  • @Erik I agree with that. One const member is OK. I.e., the following is perfectly fine: `union A { int const a; int b; } a = { 0 };`. – Johannes Schaub - litb Apr 13 '11 at 19:48
  • @litb: So you're saying that the standard contradicts itself on this? I don't see how activating a union member can be considered "creating a new object" while at the same time a const member of a union is valid (considering your original quote)? – Erik Apr 13 '11 at 19:54
  • @Johannes: I'm not surprised I didn't find your 'relevant comment'. What does it have to do with unions? – TonyK Apr 13 '11 at 21:13
  • @Erik I don't mean to say that the spec contradicts itself. I think that `a.y = 0` creates a `float` object, just like `void *p = malloc(max(sizeof(int), sizeof(float))); *(int*)p = 0; *(float*)p = 0.f;` first creates an `int` object and then creates a `float` object. I don't understand why you think that allowing a const member of a union contradicts the notion that writing to a union member would create a new object. – Johannes Schaub - litb Apr 13 '11 at 21:45
  • @Johannes Schaub - litb: If `a.y = 0` *creates* a float it implies that the existing const int is also destroyed - which violates your original standard quote - thus if it applies you can't have a const member of a union. But if you *can* legally have a const member of a union (as circumstantial evidence seems to imply), then either "activation/deactivation" cannot be considered creation/destruction *or* the standard is contradicting itself. Or? – Erik Apr 14 '11 at 08:29
  • @Erik The new float object will occupy a storage area that a `const int` object used to occupy. And hence it will be undefined behavior. "thus if it applies you can't have a const member of a union" -> Why is that? There are situations where it doesn't apply and where we still have a const member. I gave an example above: `union A { const int x; float y; } a = { 0 };`. This is not undefined behavior. Sure if you now write to `a.y`, you do undefined behavior. But there is no spec quote contradicting that. – Johannes Schaub - litb Apr 14 '11 at 11:21
  • I think "active member" is just a nice-sounding term the spec uses to say "creating an object of the type of that member". There's no deeper meaning to it. You can alias the old object by what 3.10/15 defines for certain cases. Like `union A { int x; unsigned char y; }; A a = { 10 }; unsigned char y = a.y;`, which is perfectly valid, although it looks like reading from a "non-active" member. This is what James Dennett appears to refer to when he says that for some cases, type punning through a union is OK. – Johannes Schaub - litb Apr 14 '11 at 11:27
  • @Johannes Schaub - litb: Using your `union A { const int x; float y; } a = { 0 };` sample I think UB is invoked not when you write to `a.y` but if you thereafter read from `a.x` - If activation is creation, then there's no use for a union with a const member - you can't ever do anything but read the const member without invoking UB. If activation *isn't* creation, then you can at least use the union as normal, you just can't reactivate the const member after activating something else. But, this is definitely not clear from the 03 standard. – Erik Apr 14 '11 at 12:20
  • @Erik I do agree in that UB is invoked in any case if you afterwards read from `a.x` (after you wrote to `a.y`). Our disagreement is about whether UB happens earlier. This is an edge case, so I think it's fine with the majority of uses being undefined behavior. The spec doesn't *specifically* allow const members of unions in an act of "ohh they are very useful" or something. It just comes out of existing rules, and happens that most uses are undefined. – Johannes Schaub - litb Apr 14 '11 at 12:29
  • @Johannes Schaub - litb: So what the standard lacks is to be explicit about activation - is it creation or not. And yeah, this is very much an edge case - still amusing to try figuring it out though. – Erik Apr 14 '11 at 12:43
2

It doesn't really make sense to have a const member of a union, and I'm surprised that the standard allows it. The purpose of all of the many limitations on what can go into a union is to arrive at a point where bitwise assignment will be a valid assignment operator for all members, and you can't use bitwise assignment to assign to a const int. My guess is that it's just a case that no one had previously thought of (although it affects C as well as C++, so it's been around for a while).
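
As a rough illustration of the assignment point, here is a sketch that assumes C++11 or later (this answer was written against C++03, which has no deleted functions). Under the C++11 rules the implicit copy assignment operator of a class with a const non-static data member is defined as deleted, which the type traits below report. The helper `a_is_copy_assignable` is a hypothetical name added only so the result can be checked at run time:

```cpp
#include <cassert>
#include <type_traits>

// The union from the question.
union A {
  int const x;
  float y;
};

// Assumption: C++11 or later. The const member makes the implicit copy
// assignment operator deleted, while copy construction stays available.
static_assert(!std::is_copy_assignable<A>::value,
              "const member blocks whole-union assignment");
static_assert(std::is_copy_constructible<A>::value,
              "copy construction is still allowed");

// Hypothetical helper exposing the trait as a run-time value.
inline bool a_is_copy_assignable() {
  return std::is_copy_assignable<A>::value;
}
```

So `A b = a;` compiles for an initialized `a`, whereas `b = a;` is rejected.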

James Kanze
  • Is it that unreasonable to want a read-only member of a union? For example if I'm using the union to examine the bytes of a float one at a time, I might want `union { float f; const unsigned char b[sizeof(float)]; }`, if that had the desired/expected effect of enforcing that I don't write byte-wise, only read. Of course in that example I could just cast to `unsigned char*`, so maybe I need another one, but type-punning through unions isn't always guaranteed so it's implementation-dependent what this actually wins you. – Steve Jessop Apr 13 '11 at 18:25
  • Cast to `const unsigned char*`, I mean! – Steve Jessop Apr 13 '11 at 18:35
  • It makes perfect sense to have a `const` member of a `union`. It can be initialised when the `union` is constructed, but not later. If any other member of the `union` is subsequently assigned to, then the `const` member becomes undefined for ever. – TonyK Apr 13 '11 at 18:38
  • I'm sorry. I changed the type of the second member to `float`, to prevent any confusion that it would have something to do with `int` vs `int const` or something. – Johannes Schaub - litb Apr 13 '11 at 21:42
  • What's the point of a const member of a union? There's no point in using a union unless you're going to change it; otherwise, a variable of the original type would do the trick. – James Kanze Apr 14 '11 at 08:07
  • @JamesKanze: the point may be to implement a read-only getter. – sam hocevar Mar 06 '12 at 14:02
  • @JamesKanze: the `union` can be combined with the proxy pattern. See for instance https://gist.github.com/1987477 for a not very elegant but hopefully illustrative example. – sam hocevar Mar 06 '12 at 16:59
  • @SamHocevar I'm not sure that the code there is even legal; it certainly isn't in C++03. And even if it was, I don't see what the `union` buys you. – James Kanze Mar 06 '12 at 17:03
  • @JamesKanze: the `union` allows C# and GLSL-like getter/setter syntax (without parentheses), while ensuring compactness (because there is no need to store `this` in the proxy). Out of curiosity, does it break anything in C++03 apart from the anonymous `struct`? – sam hocevar Mar 06 '12 at 17:18
  • @SamHocevar A union cannot contain types with non-trivial constructors, destructors or assignment operators. And even in C++11, I don't think that there are anonymous structs, and I'm pretty sure that if you access an element of the union other than the one last stored in it, it's undefined behavior. (So accessing `norm` if the last element stored was `x` or `y` in your code is undefined behavior.) – James Kanze Mar 06 '12 at 17:54
  • @JamesKanze: there are no non-trivial constructors, destructors or assignment operators in the above code. Also, the part of the C++ standard you refer to is one of the most misinterpreted and I believe it is correct to access members of `union`s if they have the same type and address. Or are you implying that `union { int i; int j; } u; u.i = 1; cout << u.j;` is UB? (besides, every compiler I know supports type-punning through `union`s, but that's another story) – sam hocevar Mar 06 '12 at 18:07
  • @SamHocevar I missed the fact that `Norm` didn't have any constructors. But it has private data members, and with no constructors, there's no way to initialize them. Officially, too, your short example is undefined behavior (although I too would be very surprised if it failed anywhere). Your case is more complicated, however: the struct with the `int`s and `Norm` aren't even layout compatible. – James Kanze Mar 06 '12 at 18:33
1

If it's any consolation - the Microsoft Xbox 360 compiler (which is based on Visual Studio's compiler) does error out. Which is funny, because that's usually the most lenient of the bunch.

error C2220: warning treated as error - no 'object' file generated
warning C4510: 'A' : default constructor could not be generated
    : see declaration of 'A'
warning C4610: union 'A' can never be instantiated - user defined constructor required

This error goes away if I take the const away. gcc-based compilers don't complain.

EDIT: The Microsoft Visual C++ compiler has the same warning.
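
For comparison, a minimal sketch of the brace initialization used in the question, which needs no constructor because it initializes the first member directly. The function name `first_member_after_brace_init` is hypothetical and exists only to make the result observable:

```cpp
#include <cassert>

// The union from the question.
union A {
  int const x;
  float y;
};

// Brace initialization sets the first member, x, without any constructor.
inline int first_member_after_brace_init() {
  A a = { 0 };
  return a.x;  // x is the active member here, so reading it is fine
}
```

This only covers instantiation; it says nothing about whether a later write to `a.y` is allowed.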

EboMike
  • Visual C++ 2010 gives the same warning (with `/W4`). That warning (C4610) is interesting in that it is wrong: `A` can indeed be instantiated as demonstrated by `A x = { 0 };` (which Visual C++ 2010 does accept). – James McNellis Apr 13 '11 at 18:04
  • Actually, it seems to be that Visual Studio C++ doesn't like uninitialized const ints, period. I tried adding `const int x` on global scope, and that created a warning as well. Makes sense, since the compiler treats const ints as compile-time constants. – EboMike Apr 13 '11 at 18:07
  • @Johannes: True. It's just my guess that the compiler trips over the const int handling and insists that a const int needs to be initialized via a constructor or an initial assignment and doesn't consider `={0}` in unions. – EboMike Apr 13 '11 at 18:10
  • To summarize - I believe you're right, it's undefined behavior, and Visual C++ accidentally warns you about it, although not on purpose, but due to ineptitude. – EboMike Apr 13 '11 at 18:12