
Consider the following pair of mutually referencing types:

struct A;
struct B { A& a; };
struct A { B& b; };

These can be initialized with aggregate initialization in GCC, Clang, Intel, and MSVC, but not in SunPro, which insists that user-defined ctors are required.

struct {A first; B second;} pair = {pair.second, pair.first};

Is this initialization legal?

slightly more elaborate demo: http://ideone.com/P4XFw

Now, heeding Sun's warning, what about classes with user-defined constructors? The following works in GCC, clang, Intel, SunPro, and MSVC, but is it legal?

struct A;
struct B { A& ref; B(A& a) : ref(a) {} };
struct A { B& ref; A(B& b) : ref(b) {} };

struct {B first; A second;} pair = {pair.second, pair.first};

demo: http://ideone.com/QQEpA

And finally, what if the container is not trivial either? For example, the following works in G++, Intel, and Clang (with warnings), but not in MSVC ("pair" unknown in initializer) or SunPro ("pair is not a structure"):

std::pair<A, B> pair(pair.second, pair.first);

From what I can see, §3.8[basic.life]/6 forbids access to a non-static data member before lifetime begins, but is lvalue evaluation of pair.second "access" to second? If it is, then are all three initializations illegal? Also, §8.3.2[dcl.ref]/5 says "reference shall be initialized to refer to a valid object" which probably makes all three illegal as well, but perhaps I'm missing something and the compilers accept this for a reason.

PS: I realize these classes are not practical in any way, hence the language-lawyer tag. Related and marginally more practical old discussion here: Circular reference in C++ without pointers

Cubbi
  • My gut feeling is that your aggregate constructions are correct (though I can't prove it just now), while the non-aggregate version with `std::pair` is certainly not allowed for the reasons you state. You can ask a simpler question: `struct Foo { Foo & r; Foo(Foo & f) : r(f) { } } x(x);` – Kerrek SB Jan 03 '12 at 23:21
  • Can you get it to work with pointers? – Karl Knechtel Jan 03 '12 at 23:54
  • The second example is not legal as far as I can tell, as only aggregates (which may not have user-declared constructors) can have brace-initializers: `§8.5[dcl.init]/14` – Emil Styrke Jan 04 '12 at 19:19
  • Related: [Passing `this` before base constructors are done: UB or just dangerous?](http://stackoverflow.com/questions/8126713) (I believe, it almost answers your question with "It's not possible") – bitmask Jan 07 '12 at 21:02

2 Answers


This one was warping my mind at first, but I think I've got it now. Per §12.6.2/5 of the 1998 Standard, C++ guarantees that data members are initialized in the order they are declared in the class, and that the constructor body is executed after all members have been initialized. This means that the declaration

struct A;
struct B { A& a; };
struct A { B& b; };
struct {A first; B second;} pair = {pair.second, pair.first};

makes sense, since pair is an auto (local, stack) variable, so its relative address and the addresses of its members are known to the compiler, and there are no constructors for first and second.

Why the two conditions mean the code above makes sense: when first, of type A, is constructed (before any other data member of pair), first's data member b is set to reference pair.second, whose address is known to the compiler because it is a stack variable (space for it already exists in the program, AFAIU). Note that pair.second as an object, i.e. a memory segment, has not been initialized (it contains garbage), but that doesn't change the fact that the address of that garbage is known at compile time and can be used to set references. Since A has no constructor, it can't attempt to do anything with b, so the behavior is well defined. Once first has been initialized, it is second's turn, and the same applies: its data member a references pair.first, which is of type A, and the address of pair.first is known to the compiler.
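A minimal sketch for checking the binding described above on your own compiler. The named type `Pair` and the function `references_are_cross_bound` are illustrative assumptions (the original uses an unnamed struct). The check only shows what a given compiler produces; it says nothing about whether the standard permits the initialization.

```cpp
#include <cassert>

struct A;
struct B { A& a; };
struct A { B& b; };

// Hypothetical named stand-in for the unnamed struct in the question.
struct Pair { A first; B second; };

// Returns true if each reference member ended up bound to its sibling.
bool references_are_cross_bound() {
    // 'pair' is already in scope inside its own initializer, so the
    // member lvalues can be named before either member is initialized.
    Pair pair = {{pair.second}, {pair.first}};

    // Taking the address of a reference yields the address of its referent.
    return &pair.first.b == &pair.second
        && &pair.second.a == &pair.first;
}
```

Calling it as `assert(references_are_cross_bound());` should pass on the compilers the question lists as accepting the aggregate form. Expect an "uninitialized" warning from some of them.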

If the addresses were not known to the compiler (say, because the object lives in heap memory obtained via the new operator), there should be a compile error or, failing that, undefined behavior. Judicious use of placement new might still allow it to work, since the addresses of both first and second could then be known by the time first is initialized.
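The placement-new idea might look roughly like the sketch below. It assumes C++11 (`alignas`, braced initialization in a new-expression), and the names `Pair`, `storage`, `raw` and `placement_new_cross_bound` are made up for illustration. Naming `raw->second` before the object exists raises the same legality question as the original, so treat this as an experiment rather than a recommended pattern.

```cpp
#include <cassert>
#include <new>

struct A;
struct B { A& a; };
struct A { B& b; };
struct Pair { A first; B second; };  // hypothetical named aggregate

bool placement_new_cross_bound() {
    // Storage is obtained first, so the member addresses are fixed
    // before any member is initialized.
    alignas(Pair) unsigned char storage[sizeof(Pair)];
    Pair* raw = reinterpret_cast<Pair*>(storage);

    // Construct in place, binding each reference to the sibling's storage.
    Pair* p = new (storage) Pair{{raw->second}, {raw->first}};

    // Pair is trivially destructible, so no explicit destructor call is needed.
    return &p->first.b == &p->second
        && &p->second.a == &p->first;
}
```

A quick check is `assert(placement_new_cross_bound());`.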

Now for the variation:

struct A;
struct B { A& ref; B(A& a) : ref(a) {} };
struct A { B& ref; A(B& b) : ref(b) {} };
struct {B first; A second;} pair = {pair.second, pair.first};

The only difference from the first code example is that A and B now have explicitly defined constructors (and the member order in pair is swapped), but the assembly code is surely identical, as there is no code in the constructor bodies. So if the first code sample works, the second should too.

HOWEVER, if there is code in the constructor body of B, which is getting a reference to something (pair.second) that hasn't been initialized yet (but whose address is defined and known), and that code uses a, well, clearly you're looking for trouble. If you're lucky you'll get a crash, but writing to a will probably fail silently, as the values later get overwritten when the A constructor is eventually called.
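To see why using a inside B's constructor body is risky, the sketch below records the order in which the constructor bodies run. The `event_log` helper, the named `Pair` type and `construction_order` are assumptions added for illustration. Neither constructor reads through its reference, so the sketch itself never touches an unconstructed object.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a global record of which constructor bodies ran.
std::string& event_log() {
    static std::string events;
    return events;
}

struct A;
struct B { A& ref; B(A& a) : ref(a) { event_log() += "B"; } };
struct A { B& ref; A(B& b) : ref(b) { event_log() += "A"; } };

// Hypothetical named stand-in for the unnamed struct in the question.
struct Pair { B first; A second; };

// Returns the order in which the two constructor bodies executed.
std::string construction_order() {
    event_log().clear();
    Pair pair = {pair.second, pair.first};
    (void)pair;  // only the side effects of construction matter here
    return event_log();
}
```

Members are initialized in declaration order, so `construction_order()` should return `"BA"`: B's body runs while pair.second is still unconstructed, and anything it reads or writes through a at that point is unreliable.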

Oliver

From the compiler's point of view, references are nothing but const pointers. Rewrite your example with pointers and it becomes clear how and why it works:

struct A;
struct B { A* a; };
struct A { B* b; };
struct {A first; B second;} pair = {&(pair.second), &(pair.first)}; //parentheses for clarity

As Schollii wrote: the memory is allocated beforehand, and is thus addressable. There is no access and no evaluation, because only references/pointers are involved. That's merely taking the addresses of "second" and "first": simple pointer arithmetic.
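The "simple pointer arithmetic" claim can be checked directly: each stored pointer should equal the address of pair plus the member's offset. This is a sketch under the assumption of a named, standard-layout `Pair` type; `pointers_match_offsets` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>

struct A;
struct B { A* a; };
struct A { B* b; };
struct Pair { A first; B second; };  // hypothetical named aggregate

// Returns true if the stored pointers are just "base address + member offset".
bool pointers_match_offsets() {
    Pair pair = {{&pair.second}, {&pair.first}};

    const char* base      = reinterpret_cast<const char*>(&pair);
    const char* to_second = reinterpret_cast<const char*>(pair.first.b);
    const char* to_first  = reinterpret_cast<const char*>(pair.second.a);

    return to_second == base + offsetof(Pair, second)
        && to_first  == base + offsetof(Pair, first);
}
```

For example, `assert(pointers_match_offsets());` passes when the initializer does nothing more than compute member addresses.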

I could rant about how using references in any place other than operators is language abuse, but I think this example highlights the issue well enough :)

(From now on I write all the ctors manually. Your compiler may or may not do this automagically for you.) Try using new:

struct A;
struct B { A& a; B(A& arg):a(arg){;} };
struct A { B& b; A(B& arg):b(arg){;} };
typedef struct PAIR{A first; B second; PAIR(B& argB, A& argA):first(argB),second(argA){;}} *PPAIR, *const CPPAIR;
PPAIR pPair = NULL;// just to clean garbage or 0xCDCD
pPair = new PAIR(pPair->second, pPair->first);

Now it depends on the order of execution. If the assignment is made last (after the ctor), second.a will refer to address 0x0000 and first.b to e.g. 0x0004.
Actually, in http://codepad.org/yp911ug6 it's the ctors which are run last (makes the most sense!), therefore everything works (even though it appears it shouldn't).

Can't speak about templates, though.

But your question was "Is that legal?". No law forbids it.
Will it work? Well, I don't trust compiler makers enough to make any statements about that.

Agent_L