The C++ reference has the following explanation for unions, with the interesting part for this question in bold:
The union is only as big as necessary to hold its largest data member. The other data members are allocated in the same bytes as part of that largest member. The details of that allocation are implementation-defined, and it's undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.
Now, if I compile the following code on Linux Mint 18 with g++ -std=c++11, I get the following output (given by the comments next to the printf statements):
#include <cstdio>
using namespace std;

union myUnion {
    int var1;      // 32 bits
    long int var2; // 64 bits
    char var3;     // 8 bits
}; // union size is 64 bits (size of largest member)

int main()
{
    myUnion a;
    a.var1 = 10;
    printf("a is %ld bits and has value %d\n",sizeof(a)*8,a.var1); // ...has value 10
    a.var2 = 123456789;
    printf("a is %ld bits and has value %ld\n",sizeof(a)*8,a.var2); // ...has value 123456789
    a.var3 = 'y';
    printf("a is %ld bits and has value %c\n",sizeof(a)*8,a.var3); // ...has value y
    printf("a is %ld bits and has value %ld\n",sizeof(a)*8,a.var2); //... has value 123456789, why???
    return 0;
}
On the line before return 0, reading a.var2 does not give the ASCII decimal value of the 'y' character (which is what I expected, I'm new to unions) but the value that was originally assigned to it. Based on the above quote from cppreference.com, am I to understand that this is undefined behaviour in the sense that it is not standard, but rather GCC's particular implementation?
EDIT
As pointed out by the great answers below, I made a copying mistake in the comment after the printf statement just before return 0. The correct version is:
printf("a is %ld bits and has value %ld\n",sizeof(a)*8,a.var2); //... has value 123456889, why???
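i.e. the 7 changes to an 8, because the first 8 bits (the lowest-order byte of var2, since the output shows a little-endian layout) are overwritten with the ASCII value of the 'y' character, i.e. 121 (0111 1001 in binary). The old low byte was 21, so the value becomes 123456789 - 21 + 121 = 123456889. I'll leave it as it is in the above code to stay coherent with the great discussion that resulted from it, though.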