13
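#include <iostream>
using namespace std;

int main()
{
    union
    {
        int i;
        bool b;
    } x;

    x.i = 20000;
    x.b = true;   // shares storage with x.i
    cout << x.i;  // reads a member other than the one last written
}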

It prints out 19969. Why does it not print out 20000?

Stephen
Joe

4 Answers

31

A union is not a struct. In a union, all of the data occupies the same space and can be treated as different types via its field names. When you assign true to x.b, you are overwriting the lower-order bits of 20000.

More specifically:

20000 in binary: 100111000100000

19969 in binary: 100111000000001

What happened here is that you put a one-byte value of 1 (00000001) into the 8 low-order bits of 20000 (this assumes a little-endian machine and a one-byte bool).

If you use a struct instead of a union, you will have space for both an int and a bool, rather than just an int, and you will see the results you expected.

danben
3

In a union, all data members start at the same memory location. In your example you can only really use one data member at a time. This feature can be used for some neat tricks however, such as exposing the same data in multiple ways:

union Vector3
{
  int v[3];
  struct
  {
    int x, y, z;
  };
};

This lets you access the three integers either by name (x, y and z) or as an array (v).

Niki Yoshiuchi
  • ...or it *might* anyway. Then again, it might insert padding between the `int`s in the struct, in which case `v[1]` and `v[2]` will NOT correspond to `y` and `z` as intended. – Jerry Coffin May 19 '10 at 19:13
  • I'm not aware of a compiler that would ever put padding between two data members of the same type. Padding comes up when consecutive data members are of different sizes, such as a char followed by a short (an additional char would be packed between the two, most likely). That said, the structure might be padded so that its total size is word-aligned, so if x, y, and z were each a byte, it's possible it would be padded as x, y, z, p or p, x, y, z (which would indeed break my example). – Niki Yoshiuchi May 19 '10 at 19:28
  • Note that neither C nor C++ has anonymous structs. So you need to put a name like `} vs;` for the struct object. Or rely on compiler extensions if you can. – Johannes Schaub - litb May 19 '10 at 19:38
3

A union only stores one of the members at any given time. To get defined results, you can only read the same member from the union that was last written to the union. Doing otherwise (as you are here) officially gives nothing more or less than undefined results.

Sometimes unions are intentionally used for type-punning (e.g., looking at the bytes that make up a float). In this case, it's up to you to make sense of what you get. The language sort of tries to give you a fighting chance, but it can't really guarantee much.

Jerry Coffin
  • I don't think anything you've said here is factually incorrect, but I do take issue with the use of the word "undefined". "Undefined" is when you read from uninitialized memory; when you have four bytes and overwrite one of them, you should know exactly what you're going to have (though it may look weird once you cast it). – danben May 19 '10 at 19:19
  • @danben, you don't even know that you are overwriting one byte. The write through the bool could overwrite some arbitrary part of the int, and if you later read the int without assigning to it first, you could cause a CPU exception or similar (both C and C++ allow for such things). The behavior really is undefined; anything can happen. – Johannes Schaub - litb May 19 '10 at 19:40
  • I believe according to the standard that this particular behavior is implementation-defined. – danben May 19 '10 at 19:45
  • @danben: It depends on which standard you go by. C89/90 makes it undefined behavior. In C99, writing to one results in an unspecified value in the other, but is not allowed to create a trap representation in the other. In C++, doing so is implicitly undefined (i.e., no behavior is defined, so doing it gives undefined behavior). In no case is it implementation defined though (that would require the implementation to document what happens, which would be difficult and nearly pointless). – Jerry Coffin May 19 '10 at 21:07
  • @danben: as of n1256, the behavior is unspecified: J.1/1, "The value of a union member other than the last one stored into (6.2.6.1)." – John Bode May 19 '10 at 21:22
  • You are all correct; I was thinking of something else entirely. I retract my initial comment. – danben May 19 '10 at 22:04
  • I don't believe the behavior of C++ is any different from that of C99. Can you please refer to the paragraphs (normative text only, please) that draw the difference? I.e., I think that if the other member is an unsigned char, you are allowed to read from it even though the last write was to the other member (no trap can occur). Thanks. The non-normative text @John quoted is defective, since an unspecified value does not allow for traps, but such a member *can* have a trap representation. – Johannes Schaub - litb May 20 '10 at 10:56
  • @Johannes: C99, §6.2.6.1/7: "When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values, but the value of the union object shall not thereby become a trap representation." At least as I read it, since the union object can't become a trap representation, you can read any member without it being a trap representation either. – Jerry Coffin May 20 '10 at 11:17
  • @Jerry no it just says that the union object can't become a trap representation. It means that you can still do `u1 = u2;` if those are unions. But individual members can still store trap representations with respect to their type. There is a footnote on `6.5.2.3/3` that stresses that a trap can occur. – Johannes Schaub - litb May 20 '10 at 18:04
  • @Johannes: can you re-check that section number? In my copy, §6.5.2.3/3 doesn't have any footnotes. Perhaps you intended §6.3.2.3/5? That does have a footnote that talks about trap representations when integers are converted to pointers, which could be relevant here, but I'm not entirely sure it is. It talks about the result of a conversion, not simply looking at the (un-converted) bits as a different target type. – Jerry Coffin May 20 '10 at 18:24
  • @Jerry i'm using the C99 TC3 draft. It's a footnote starting with "If the member used to access...". It's at the first paragraph on the semantics about structure and union member access. – Johannes Schaub - litb May 20 '10 at 18:27
  • @Johannes:Ah hah -- you're quite right. I was going by the original C99 standard (where I'd say it was at least open to question). WRT to C07, however, there seems no room for doubt that it *is* undefined behavior. – Jerry Coffin May 20 '10 at 18:47
1

A union in C lets different variables share the same memory.
So when you change any member of a union, the values of all the other members are affected as well.

iamabyte