
In C++11, I have the following union:

union SomeData
{
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
    unsigned char String[128];
};

If I initialize the union like this:

SomeData data {};

Is it guaranteed that the entire contents of the union will be zeroed out? Put another way: is an empty list-initializer of a union functionally equivalent to memset-ing the union to zero?

memset(&data, 0, sizeof(data));

In particular, I'm concerned about the string data. I'd like to ensure the entire length of the string contains zeros. It appears to work in my current compiler, but does the language of the spec guarantee this to always be true?

If not: is there a better way to initialize the full length of the union to zero?

BTownTKD

2 Answers


No, it is not guaranteed that the entire union will be zeroed out. Only the first declared member of the union, plus any padding, is guaranteed to be zeroed (proof below).

So to ensure the entire memory region of the union object is zeroed, you have these options:

  • Order the members such that the largest member is first and thus the one zeroed out.
  • Use std::memset or equivalent functionality. To avoid forgetting the call, you can give SomeData a default constructor that performs it.
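The first option can be sketched as follows. This is a minimal illustration, not a definitive implementation: `SomeDataLargestFirst`, `string_is_all_zero` and `demo` are hypothetical names, and the `static_assert` is an added guard that checks the assumption that `String` spans the whole union.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reordering of the question's union: the largest member comes first.
union SomeDataLargestFirst
{
    unsigned char String[128];
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
};

// Guard the assumption that String covers every byte of the union, so that
// zero-initializing the first member leaves no byte untouched.
static_assert(sizeof(SomeDataLargestFirst) == sizeof(unsigned char[128]),
              "String must cover the entire union");

// Returns true if every element of the first member is zero.
inline bool string_is_all_zero(const SomeDataLargestFirst& d)
{
    for (std::size_t i = 0; i < sizeof(d.String); ++i)
        if (d.String[i] != 0)
            return false;
    return true;
}

// Usage: empty braces zero-initialize the first member, which is now String.
inline void demo()
{
    SomeDataLargestFirst data {};
    assert(string_is_all_zero(data));
}
```

If a later edit adds a member larger than `String`, the `static_assert` fails at compile time instead of silently leaving bytes uncovered.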

Quoting C++11:

8.5.4 [dcl.init.list]/3

List-initialization of an object or reference of type T is defined as follows:

  • If the initializer list has no elements and T is a class type with a default constructor, the object is value-initialized.

8.5 [dcl.init]/7

To value-initialize an object of type T means:

  • if T is a (possibly cv-qualified) class type (Clause 9) with a user-provided constructor (12.1), then the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);
  • if T is a (possibly cv-qualified) non-union class type without a user-provided constructor, then the object is zero-initialized and, if T’s implicitly-declared default constructor is non-trivial, that constructor is called.
  • ...
  • otherwise, the object is zero-initialized.

8.5 [dcl.init]/5:

To zero-initialize an object or reference of type T means:

...

  • if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero-initialized and padding is initialized to zero bits;

From these quotes, you can see that using {} to initialise data will cause the object to be value-initialized (since SomeData is a class type with a default constructor).

Value-initializing a union without a user-provided default constructor (which SomeData is) means zero-initializing it.

Finally, zero-initializing a union means zero-initializing its first non-static named data member.
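The preconditions of this chain can be checked with type traits. The sketch below assumes a standard library that provides `std::is_trivially_default_constructible`; `first_member_after_brace_init` is a hypothetical helper. It only checks the first member, since that is the part the quoted wording covers without dispute.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

union SomeData
{
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
    unsigned char String[128];
};

// SomeData is a union type without a user-provided default constructor,
// so value-initialization ends up as zero-initialization.
static_assert(std::is_union<SomeData>::value, "SomeData is a union type");
static_assert(std::is_trivially_default_constructible<SomeData>::value,
              "SomeData has no user-provided default constructor");

// Returns the first member (Byte) after `SomeData data {};`.
inline std::uint8_t first_member_after_brace_init()
{
    SomeData data {};
    return data.Byte;
}
```

Nothing here inspects the bytes of `String` beyond the first member; whether those are zeroed is exactly the open point.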

Toby Speight
Angew is no longer proud of SO
  • That's a bummer. Is there a simple way to initialize the full union to zero? Or is 'memset' the only way? – BTownTKD Mar 01 '17 at 15:02
  • @BTownTKD Put the largest member first? – NathanOliver Mar 01 '17 at 15:03
  • Oh. Yeah. Durp. – BTownTKD Mar 01 '17 at 15:14
  • My understanding of *padding is initialized to zero bits* is that the remaining part of the union will be set to 0. – Serge Ballesta Mar 01 '17 at 15:55
  • @SergeBallesta I tested it on an example from a related question, and it seems you are right! I have edited my answer; it now points to yours. – Oliv Mar 01 '17 at 18:18
  • This is [core issue 694](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#694). The stated intent is to zero out the entire union. – T.C. Mar 01 '17 at 18:18
  • @SergeBallesta I'm not quite sure that the space reserved for inactive members of the union counts as padding. Padding is normally the area *outside* of members, introduced for alignment purposes. – Angew is no longer proud of SO Mar 02 '17 at 07:04
  • @T.C. Unfortunately, "padding" is wonderfully underspecified in the standard (it doesn't even have an index entry!). Are the parts of a union's inactive members which don't overlap the active member really padding? Wouldn't padding be more like extra space at the end for array-alignment purposes? – Angew is no longer proud of SO Mar 02 '17 at 07:10
  • @Angew, here is a quote from issue 694 that explains why the C committee added "and padding is zero-initialized": *The C committee is considering changing the definition of zero-initialization of unions to guarantee that **the bytes of the entire union** are set to zero before assigning 0, converted to the appropriate type, to the first member.* But it seems to apply to zero-initialization, which is performed only on static objects, no? – Oliv Mar 02 '17 at 07:45
  • @Oliv If you read the quotes in the A, you'll see that zero-init happens as part of this value-init. And I've read the rationale, which would IMO be realised by the proposed 2008 resolution. However, I cannot be sure that the actually accepted 2010 resolution has the same effect, especially since it's preceded with "The C Committee has changed its approach to this question". It's very possible the intent is to zero-out the entire union, but I don't really see how this wording about padding guarantees that. – Angew is no longer proud of SO Mar 02 '17 at 07:59
  • @Angew 9.5 Unions [class.union] says *Each non-static data member is allocated as if it were the sole member of a struct.* So at first member initialization time, it should be considered as the sole member, and the padding should be all the bytes after it - but I would not rely too much on all C++ implementers understanding the standard that way ;-) – Serge Ballesta Mar 02 '17 at 08:02
  • @SergeBallesta Good find, it seems that's the intent, then. However, as you say, it's far from unambiguous. – Angew is no longer proud of SO Mar 02 '17 at 08:05
  • @Angew: even though your answer is the accepted one, and even though I really think that the intent is that all bytes of the union are set to 0, I've added a warning to my answer :-) – Serge Ballesta Mar 02 '17 at 08:16
  • @Angew, @SergeBallesta Is the union `SomeData` not an aggregate? This is important because, according to the standard, aggregate initialization does not cause zero-initialization, no? – Oliv Mar 02 '17 at 10:25
  • @Oliv Doesn't matter. Aggregate initialisation is the **second** bullet point under 8.5.4/3 (and starts with "Otherwise"), while value initialisation is the **first** (the one I quoted). So, since the first bullet point applies, value-init happens and aggregate-init does not. – Angew is no longer proud of SO Mar 02 '17 at 11:04
  • @Angew, you are right; my reference was the C++14 standard. In that standard, the first bullet only applies to copy initialization. With this change to the standard, is there any risk that compilers remove this zero-initialization? – Oliv Mar 02 '17 at 11:23
  • @Oliv Wow, that's quite a change! When I have the time, I will definitely incorporate it into the answer (even though the Q specifies C++11, it's definitely very relevant). However, since the brace-init-list is empty, it effectively means that the first member will be initialised from an empty initializer list (C++14 8.5.1/7), which again means value-init. But even less is then known about padding, apparently. – Angew is no longer proud of SO Mar 02 '17 at 11:30
  • @Angew The final wording is basically identical to WG14 N1387, linked in the issue, and the stated intent of that paper is clearly for this wording to reflect the "byte-flooding" behavior. – T.C. Mar 05 '17 at 20:49

The entire union will be zeroed out. More precisely, the first member of the union will be zero-initialized and all the remaining bytes in the union will be set to 0 as padding.

References (emphasis mine):

8.5 Initializers [dcl.init]
...

5 To zero-initialize an object or reference of type T means:
...
— if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero initialized and padding is initialized to zero bits;

That means that the first member of the union (here std::uint8_t Byte;) will be initialized to 0 and that all other bytes in the union will be set to 0 because they are padding bytes.


But beware. As Angew points out, "padding" is wonderfully underspecified in the standard, and a compiler could interpret the padding bytes in a union as only the bytes that follow the largest member. I would find that odd, because compatibility changes are specifically documented and previous versions (C) first initialized everything to 0 and then performed the specific initialization. But a new implementer might not be aware of that...

TL;DR: I really think that the intent of the standard is that all bytes in the union are set to 0 in the OP's example, but for a mission-critical program, I would certainly add an explicit zeroing constructor...
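A minimal sketch of such a constructor, assuming `std::memset` is acceptable for this trivially copyable union. `SomeDataZeroed` and `count_nonzero_bytes` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical variant of the OP's union with an explicit zeroing constructor.
union SomeDataZeroed
{
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
    unsigned char String[128];

    // Fill the whole object representation with zero bytes.
    // The cast to void* avoids compiler warnings about memset on a class type.
    SomeDataZeroed() { std::memset(static_cast<void*>(this), 0, sizeof(*this)); }
};

// Counts the non-zero bytes in the object's representation.
inline std::size_t count_nonzero_bytes(const SomeDataZeroed& d)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&d);
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof(d); ++i)
        if (p[i] != 0)
            ++n;
    return n;
}
```

With a user-provided constructor, both `SomeDataZeroed data;` and `SomeDataZeroed data {};` call it, so the zeroing no longer depends on how "padding" is interpreted.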

Community
Serge Ballesta
  • Interesting! That would be a happy interpretation indeed. Is there anything in the spec which clarifies the definition of "padding" in a union? – BTownTKD Mar 01 '17 at 17:11
  • Interesting. It does mean, surely, that everyone needs to put the largest member first in a union to get the entire union initialised to zero, right? – SJHowe Mar 02 '17 at 16:47
  • @SJHowe More exactly, if the first member in the union is the largest, you can be sure that all bytes in the union will be set to 0. – Serge Ballesta Mar 02 '17 at 18:32