Two arrays in a union in C++

Question

Is it possible to share two arrays in a union like this:

struct
{
    union
    {
        float m_V[Height * Length];
        float m_M[Height][Length];
    } m_U;
};

Do these two arrays occupy the same amount of memory, or is one of them larger?

It's probably not guaranteed by the standard, but in practice this will behave as expected, i.e. the two arrays will be the same size and can be used interchangeably. — Paul R, Jul 09 '12 at 11:09
Small remark: `Height` and `Length` have to be compile-time constants. Otherwise it should be okay. — Jakob S., Jul 09 '12 at 11:13
@PaulR I've actually seen a similar case fail with g++. As long as the accesses are through the union member, g++ recognizes it, but if you pass references to `m_V` and `m_M` to a function, I'm less sure. (It might work, because in the end, all of the accesses are to `double`; in the case I know where it failed, there were different base types involved.) — James Kanze, Jul 09 '12 at 11:23

score 3 · Answer 1 · answered Jul 09 '12 at 11:20

Both arrays are required to have the same size and layout. Of course, if you initialize anything using m_V, then all accesses to m_M are undefined behavior; a compiler might, for example, note that nothing in m_V has changed and return an earlier value, even though you've modified the element through m_M. I've actually used a compiler which did so, in the distant past. I would avoid accesses where the union isn't visible, say by passing a reference to m_V and a reference to m_M to the same function.

James Kanze

I believe that the standard guarantees the access in this case *because* they are layout compatible. – David Rodríguez - dribeas Jul 09 '12 at 11:28
@DavidRodríguez-dribeas In C++11, perhaps. I've not studied it in such detail. C++03 didn't have the concept of layout compatibility. (But a quick glance at §3.10 doesn't show any allowance for layout compatibility. There are some fudges for cv-qualifiers and signed/unsigned, but otherwise, the lvalue used to access the object must have either the same type or `char` or `unsigned char`.) – James Kanze Jul 09 '12 at 11:40
C++03 9.5p1 *If a POD-union contains several POD-structs that share a common initial sequence (9.2), and if an object of this POD-union type contains one of the POD-structs, it is permitted to inspect the common initial sequence of any of POD-struct members*. The wording does not mention standard layout types, but it does endorse the usage above. [I am only assuming that an array can be considered a *POD-struct*, I have not found a definition of *POD-struct* in the standard] – David Rodríguez - dribeas Jul 09 '12 at 12:25
@David: `float m_V[Height * Length];` and `float m_M[Height] [Length];` are not POD structs. They're POD, but they're arrays not structs. They don't share an initial sequence. AFAIK that rule is to allow `struct Foo {int type; char *data; }; struct Bar {int type; double *data; }; union FooBar { Foo f; Bar b; }`, then you can inspect either of `f.type` or `b.type` no matter which of `f` or `b` was last assigned. The "common initial sequence" is the `int` member. – Steve Jessop Jul 09 '12 at 12:31
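The common-initial-sequence rule described in the comment above can be sketched as follows. This is only an illustration built from the `Foo`/`Bar`/`FooBar` declarations in that comment; the helper `type_seen_through_b` is a hypothetical name introduced here, and the sketch assumes both structs are standard-layout (POD) types, which they are as written.

```cpp
#include <cassert>  // for the usage line shown after this block

struct Foo { int type; char*   data; };
struct Bar { int type; double* data; };

union FooBar
{
    Foo f;
    Bar b;
};

// Hypothetical helper: makes `f` the active member, then inspects the
// shared leading `int` through `b`. Only `type` belongs to the common
// initial sequence; `data` differs in type and must not be read via `b`.
int type_seen_through_b(int type)
{
    FooBar u{ Foo{type, nullptr} };  // brace-init selects the first member, `f`
    return u.b.type;                 // reads only the common initial sequence
}
```

For instance, `assert(type_seen_through_b(7) == 7);` holds. Nothing comparable applies to the two arrays in the question, because neither union member is a struct.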
@DavidRodríguez-dribeas That's the wording I remember. It is taken directly from the C standard, and is present for reasons of C compatibility. And the traditional interpretation in C is that a common initial sequence means just that: exactly the same types (with, I think, the same names). This is how C implemented polymorphism: the common initial sequence (generally implemented as a macro) represented the base class. And I don't see how it can apply here, since the union doesn't contain any structs, much less POD structs. – James Kanze Jul 09 '12 at 12:32

score 1 · Answer 2 · answered Jul 09 '12 at 11:10

It is implicitly guaranteed that these will be the same size in memory. The compiler is not allowed to insert padding anywhere in either the 2D array or the 1D array, because sizeof an array of N elements must equal N times sizeof one element.
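The size claim can be checked at compile time. The following is a minimal sketch, assuming C++11 and hypothetical values for `Height` and `Length` (the question only requires them to be compile-time constants); the struct name `Matrix` is also an assumption, since the struct in the question is unnamed.

```cpp
#include <cstddef>

// Hypothetical dimensions chosen for illustration only.
constexpr std::size_t Height = 3;
constexpr std::size_t Length = 4;

struct Matrix  // hypothetical name
{
    union
    {
        float m_V[Height * Length];
        float m_M[Height][Length];
    } m_U;
};

// An array of N elements occupies exactly N * sizeof(element) bytes,
// so both members come out at Height * Length * sizeof(float).
static_assert(sizeof(float[Height * Length]) == Height * Length * sizeof(float),
              "1D array has no padding");
static_assert(sizeof(float[Height][Length]) == Height * Length * sizeof(float),
              "2D array has no padding");
static_assert(sizeof(Matrix{}.m_U.m_V) == sizeof(Matrix{}.m_U.m_M),
              "both union members have the same size");
```

This only confirms the sizes; it says nothing about whether reading one member after writing the other is well defined.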

[Of course, if you wrote to m_V and read from m_M (or vice versa), you'd still be type-punning, which technically invokes undefined behaviour. But that's a different matter.]

Oliver Charlesworth

I don't think that would be undefined behavior. The standard explicitly guarantees that accessing a member of a union other than the active one is fine if the active member and the accessed member share a common initial sequence of layout-compatible members and only those members are accessed. In this case, I believe that the two arrays share an initial sequence of layout-compatible subobjects that covers the whole arrays. – David Rodríguez - dribeas Jul 09 '12 at 11:27
I can't find it at the moment (although I know it's there), but at least in C, the compiler was allowed to assume that an lvalue referring to a `double [][]` did not alias any data in an lvalue referring to a `double []`. – James Kanze Jul 09 '12 at 11:44
@JamesKanze: I think the question becomes whether for strict-aliasing purposes I have used an lvalue of type `double[][]` to "access" the `double[]` object when I write `m_M[0][0] = 0;`. If I have, then it's a clear breach of strict aliasing. But when you break down `m_M[0][0]` I'm not sure that I actually access anything using an lvalue of type `double[][]` or even `double[]` - I do form some lvalue expressions with those types, but they decay to pointers. I'm allowed to access members of `m_V` using an lvalue of type `double` (the type of `m_M[0][0]`), no matter how I got there. – Steve Jessop Jul 09 '12 at 12:39
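The types of the intermediate expressions in `m_M[0][0]` can be inspected with `decltype`. This is a sketch under assumptions: the union name `U` and the values of `Height` and `Length` are hypothetical, and the element type is `float` as in the question (the comments say `double`). It shows only what types the expressions have, not whether the access is permitted by the aliasing rules, which is the point under dispute.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical dimensions chosen for illustration only.
constexpr std::size_t Height = 3;
constexpr std::size_t Length = 4;

union U  // hypothetical name for the union in the question
{
    float m_V[Height * Length];
    float m_M[Height][Length];
};

// decltype((e)) yields T& when e is an lvalue of type T.
// std::declval<U&>() stands in for an lvalue of type U without creating one.
static_assert(std::is_same<decltype((std::declval<U&>().m_M)),
                           float (&)[Height][Length]>::value,
              "m_M is an lvalue of 2D array type");
static_assert(std::is_same<decltype((std::declval<U&>().m_M[0])),
                           float (&)[Length]>::value,
              "m_M[0] is an lvalue of 1D array type");
static_assert(std::is_same<decltype((std::declval<U&>().m_M[0][0])),
                           float&>::value,
              "m_M[0][0] is an lvalue of type float");
static_assert(std::is_same<decltype((std::declval<U&>().m_V[0])),
                           float&>::value,
              "m_V[0] is also an lvalue of type float");
```

So the expression that finally reads or writes memory has the element type in both cases, while the array-typed lvalues appear only as intermediate steps.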
@SteveJessop My impression is that the original intent was to ban this; the wording in this area is not always very clear, nor necessarily what was intended. In practice, I suspect that most, if not all compilers will treat it as accessing a `double` in both cases, and assume possible aliasing. – James Kanze Jul 09 '12 at 16:25
@James Kanze: I believe that one of the TC to C99 actually "legalized" all forms of aliasing, when they are performed through *unions* specifically. I.e. now you can alias anything to anything through unions, as long as you are not running into trap representations. Trap-representation-related problems are still there, but the compiler is no longer allowed to assume that there's no aliasing there. – AnT stands with Russia Jul 09 '12 at 16:39
When GCC implemented strict-aliasing semantics, they deliberately left unions out of it, as a workaround for those practical situations when you actually want to alias. This approach was eventually accepted into C standard. – AnT stands with Russia Jul 09 '12 at 16:41
@AndreyT When the C standard was being written, the orientation was to support aliasing through casts, rather than unions. In fact, the wording adopted didn't really allow either. I don't have a copy of the most recent C work, so I can't say what directions it has taken since then. I do know that there were cases where g++ wasn't conforming, even when unions were used in a fully conforming way (never accessing anything but the last element written). I also think that this was considered a defect in the standard. (But the problem was aliasing, and not just unions.) – James Kanze Jul 09 '12 at 17:28
@AndreyT "_when they are performed through unions_" yes, but "through union" isn't defined in the C standard! – curiousguy Jul 18 '12 at 03:42
@JamesKanze "_I do know that there were cases where g++ wasn't conforming, even when unions were used in a fully conforming way (never accessing anything but the last element written)._" I am intrigued... – curiousguy Jul 18 '12 at 03:45
"_you'd still be type-punning_" is it still called type-punning when the **only lvalue read or written** is float? – curiousguy Jul 18 '12 at 03:48
@curiousguy The problem occurs when you have a union with two different types, say `int` and `double`: you assign to the `int`, then call a function passing pointers to each of the elements. In the function, you read the `int` (through the pointer), and then write the `double` (through the other pointer). In some cases, at least g++ will reorder the read and the write in the function. This does not conform to the literal wording in the standard. (It probably does conform to the intent, but the wording doesn't match the intent.) – James Kanze Jul 18 '12 at 07:18
@curiousguy The types `float[a*b]` and `float[a][b]` are different. Accessing one, after having written to the other, is type punning. – James Kanze Jul 18 '12 at 07:19
@JamesKanze "_Accessing one, after having written to the other, is type punning._" You can only read and write `float`, not `float[a*b]` or `float[a][b]`. – curiousguy Jul 18 '12 at 07:58
@curiousguy That's not the way the C or C++ standard view it. Modifying a part of an object is modifying the object. – James Kanze Jul 18 '12 at 08:30
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/14050/discussion-between-curiousguy-and-james-kanze) – curiousguy Jul 18 '12 at 09:35
