I am puzzled by Guillaume Racicot's comment here. Is there a problem with the common initial sequence in this union or not? At least with GCC 10.1 on x86-64 (-O3 --std=c++20 -pedantic -Wall -Werror), I can write .words and then read .bytes.

sizeof(MyUnion)==32 is reassuring, too.

#include <array>
#include <cstddef>
#include <cstdint>

union MyUnion {
    static constexpr std::size_t size = 32;

    using byte = std::uint8_t;
    using word = std::uint32_t;

    std::array<byte, size> bytes;                   // 32 one-byte elements
    std::array<word, size / sizeof(word)> words;    // 8 four-byte elements
};
static_assert(sizeof(MyUnion)==32);
Xpector

1 Answer

The standard says:

[array.overview]

... An array is a contiguous container. ...

An array is an aggregate that can be list-initialized with up to N elements whose types are convertible to T.

array<T, N> is a structural type if T is a structural type.

The standard doesn't explicitly say what members std::array has. As such, we technically cannot assume that it has a common initial sequence with any type.

From the requirements quoted above, we might reasonably assume that std::array has a member of type T[N]. Let's explore whether there is a common initial sequence if this assumption is correct.

[class.mem.general]

The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that corresponding entities have layout-compatible types, ...

[basic.types.general]

Two types cv1 T1 and cv2 T2 are layout-compatible types if T1 and T2 are the same type, layout-compatible enumerations, or layout-compatible standard-layout class types.

std::uint8_t[32] and std::uint32_t[8] are not the same type (ignoring cv qualifiers), nor are they enumerations nor classes. Therefore they are not layout-compatible types, and therefore they cannot be part of the same common initial sequence.

Conclusion: No, there is no common initial sequence, regardless of whether the assumption about the member of std::array holds.
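The case analysis above can be spot-checked with type traits. This is only a sketch: `is_class_type` and `may_be_layout_compatible` are hypothetical names, and the check is a necessary condition, not a full implementation of the rule. It only tests whether two types fall into one of the three categories that [basic.types.general] lists.

```cpp
#include <cstdint>
#include <type_traits>

// True for anything the standard calls a class type (struct, class or union).
template <class T>
inline constexpr bool is_class_type = std::is_class_v<T> || std::is_union_v<T>;

// Hypothetical filter: false means the pair cannot be layout-compatible,
// because it is neither the same type, nor two enumerations, nor two classes.
// True only means the pair passes this coarse filter.
template <class T1, class T2>
inline constexpr bool may_be_layout_compatible =
    std::is_same_v<std::remove_cv_t<T1>, std::remove_cv_t<T2>> ||
    (std::is_enum_v<T1> && std::is_enum_v<T2>) ||
    (is_class_type<T1> && is_class_type<T2>);

using ByteArray = std::uint8_t[32];
using WordArray = std::uint32_t[8];

// The assumed members of the two std::array specializations fail every case.
static_assert(!may_be_layout_compatible<ByteArray, WordArray>);
// Identical types trivially pass.
static_assert(may_be_layout_compatible<ByteArray, ByteArray>);
```

C++20 also specifies `std::is_layout_compatible` in `<type_traits>`, which answers the question directly where the standard library provides it.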


I write .words and read .bytes

The behaviour of the program is undefined.

Since you want to read the data as an array of (unsigned) char, it is safe to reinterpret the existing array instead of punning through the union:

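#include <array>
#include <cstddef>
#include <cstdint>

static constexpr std::size_t size = 32;
using word = std::uint32_t;
// Only the first four words are given; the remaining four are zero-initialized.
std::array<word, size / sizeof(word)> words {
    1, 2, 3, 4,
};
// View the existing storage as bytes; no copy is made.
std::uint8_t* bytes = reinterpret_cast<std::uint8_t*>(words.data());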

And, if you want a range:

// Needs <span> (C++20); a fixed-extent view over the same 32 bytes.
std::span<std::uint8_t, size> bytes_range {
    bytes, bytes + size,
};
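The exemption that makes this cast safe is worded in terms of char, unsigned char and std::byte. std::uint8_t is an alias for unsigned char on mainstream platforms, but the standard does not appear to promise that, so a cautious version can state the assumption explicitly. The following is a minimal sketch; `Words` and `byte_at` are illustrative names, not part of the question.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

using Words = std::array<std::uint32_t, 8>;

// Assumption made explicit: the byte view is only covered by the aliasing
// exemption if std::uint8_t really is unsigned char on this platform.
static_assert(std::is_same_v<std::uint8_t, unsigned char>,
              "std::uint8_t is not unsigned char on this platform");

// Hypothetical helper: reads one byte of the array's object representation.
// The caller must keep index below sizeof(std::uint32_t) * words.size().
inline std::uint8_t byte_at(const Words& words, std::size_t index) {
    const auto* bytes = reinterpret_cast<const std::uint8_t*>(words.data());
    return bytes[index];
}
```

Which byte of a word comes first depends on the platform's endianness, so only patterns such as 0x01010101 read back the same everywhere.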
eerorika
  • GCC explicitly [allows type punning via a union’s inactive member](https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html). Since OP mentions GCC the program isn’t undefined in this case. – besc Feb 07 '21 at 20:59
  • @besc I've written the answer from the perspective of the C++ language. If a language implementation chooses to define a behaviour for UB, it is free to do so - that would then be a language extension. Your link appears to lead to GCC documentation about its C implementation though. Does it apply to the C++ implementation as well? – eerorika Feb 07 '21 at 21:06
  • Can [`std::bit_cast`](https://en.cppreference.com/w/cpp/numeric/bit_cast#Notes) ([P0476R2](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0476r2.html)) be considered as a superior alternative to `reinterpret_cast`, for this particular use case? (not available yet). – Xpector Feb 08 '21 at 10:35
  • @Xpector It's both superior and inferior depending on use case. It creates a new object instead of interpreting an existing object. But it can be used in many more cases while reinterpretation is allowed only in very limited exceptional cases. – eerorika Feb 08 '21 at 10:38
  • @eerorika From the C++ language perspective your answer is perfectly fine, no doubt about that. I just thought GCC’s special behaviour would be a useful thing to add since the OP mentions GCC specifically. The rule *does* apply to C++ as well. The C++ docs link to that place in the C docs. I can’t seem to find that place in the C++ docs at the moment, but I read it recently. – besc Feb 08 '21 at 16:44
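On the `std::bit_cast` point raised in the comments: the difference from `reinterpret_cast` is that the result is a new object holding a copy of the bytes. A rough sketch follows, assuming that `std::array<std::uint32_t, 8>` and `std::array<std::uint8_t, 32>` have the same size (checked by the static_assert). `Words`, `Bytes` and `to_bytes` are hypothetical names, and the `std::memcpy` branch is a fallback for toolchains that do not ship `std::bit_cast` yet.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if __has_include(<version>)
#include <version>  // library feature-test macros
#endif
#ifdef __cpp_lib_bit_cast
#include <bit>
#endif

using Words = std::array<std::uint32_t, 8>;
using Bytes = std::array<std::uint8_t, 32>;

// Both approaches require source and destination to have the same size.
static_assert(sizeof(Words) == sizeof(Bytes));

// Hypothetical helper: returns a copy of the object representation of words.
// Later changes to words are not visible through the returned array.
inline Bytes to_bytes(const Words& words) {
#ifdef __cpp_lib_bit_cast
    return std::bit_cast<Bytes>(words);
#else
    Bytes bytes;
    std::memcpy(bytes.data(), words.data(), sizeof(bytes));
    return bytes;
#endif
}
```

Because the result is a copy, writing through it does not modify the original words, which is the trade-off eerorika describes above.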