I have the following function template:

template <class MostDerived, class HeldAs>
HeldAs* duplicate(MostDerived *original, HeldAs *held)
{
  // error checking omitted for brevity
  MostDerived *copy = new MostDerived(*original);
  std::uintptr_t distance = reinterpret_cast<std::uintptr_t>(held) - reinterpret_cast<std::uintptr_t>(original);
  HeldAs *copyHeld = reinterpret_cast<HeldAs*>(reinterpret_cast<std::uintptr_t>(copy) + distance);
  return copyHeld;
}

The purpose is to duplicate an object of a particular type and return it "held" by the same subobject as the input. Note that in principle, HeldAs can be an ambiguous or inaccessible base class of MostDerived, so no cast can help here.
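
To make the "ambiguous base" case concrete, here is a minimal sketch with a hypothetical hierarchy (`Base`, `Left`, `Right`, `Most` and both helper functions are invented for illustration and appear nowhere in my real code). A caller who knows the concrete hierarchy can name the path to each `Base` subobject, but the generic template only sees `MostDerived` and `HeldAs`, so it has no path to spell out:

```cpp
#include <cassert>

// Hypothetical hierarchy: Most contains two distinct Base subobjects.
struct Base  { int id; };
struct Left  : Base { int l; };
struct Right : Base { int r; };
struct Most  : Left, Right { int m; };

// Base* b = static_cast<Base*>(mostPtr);  // ill-formed: Base is an ambiguous base of Most

// The caller can disambiguate by going through a named intermediate class.
Base* baseViaLeft(Most* p)  { return static_cast<Left*>(p); }
Base* baseViaRight(Most* p) { return static_cast<Right*>(p); }
```

Both helpers return a valid `Base*`, but to different subobjects of the same `Most` object, which is exactly the information `duplicate` has to preserve.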

This is my code, but it can be used with types outside my control (i.e. I cannot modify MostDerived or HeldAs). The function has the following preconditions:

  • *original is of dynamic type MostDerived
  • HeldAs is MostDerived or a direct or indirect base class of MostDerived (ignoring cv-qualification)
  • *held refers to *original or one of its base class subobjects.

Let's assume the preconditions are satisfied. Does duplicate have defined behaviour in that case?

C++11 [expr.reinterpret.cast] says (bold emphasis mine):

4 A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined. [ Note: It is intended to be unsurprising to those who know the addressing structure of the underlying machine. —end note ] ...

5 A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in 3.7.4.3, the result of such a conversion will not be a safely-derived pointer value. —end note ]

OK, let's say my compiler is GCC (or Clang, since that uses GCC's definitions of implementation-defined behaviour). Quoting GCC docs chapter 5 on C++ implementation-defined behaviour:

... Some choices are documented in the corresponding document for the C language. See C Implementation. ...

On to chapter 4.7 (C implementation, arrays and pointers):

The result of converting a pointer to an integer or vice versa (C90 6.3.4, C99 and C11 6.3.2.3).

A cast from pointer to integer discards most-significant bits if the pointer representation is larger than the integer type, sign-extends if the pointer representation is smaller than the integer type, otherwise the bits are unchanged.

A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.

So far, so good. It would seem that since I'm using std::uintptr_t which is guaranteed to be large enough for any pointer, and since I'm dealing with the same types, copyHeld should point to the same HeldAs subobject of *copy as held was pointing to within *original.
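
As a sketch of what that expectation looks like in practice, the following applies the template to a hypothetical hierarchy with an ambiguous base (`Base`, `Left`, `Right`, `Most` and `duplicateLandsOnRightBase` are made up for this illustration). It only shows what one would expect to observe on mainstream GCC/Clang targets, assuming every `Most` object has the same layout. It says nothing about whether the behaviour is formally defined:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical hierarchy with two distinct Base subobjects.
struct Base  { int id; };
struct Left  : Base { int l; };
struct Right : Base { int r; };
struct Most  : Left, Right { int m; };

// The function template from the question (error checking still omitted).
template <class MostDerived, class HeldAs>
HeldAs* duplicate(MostDerived* original, HeldAs* held)
{
    MostDerived* copy = new MostDerived(*original);
    std::uintptr_t distance = reinterpret_cast<std::uintptr_t>(held)
                            - reinterpret_cast<std::uintptr_t>(original);
    return reinterpret_cast<HeldAs*>(reinterpret_cast<std::uintptr_t>(copy) + distance);
}

// Returns true if the result refers to the Base-within-Right subobject of the copy.
bool duplicateLandsOnRightBase()
{
    Most original = Most();
    static_cast<Left&>(original).id  = 1;
    static_cast<Right&>(original).id = 2;
    original.m = 7;

    Base* held = static_cast<Right*>(&original);   // hold via the Right path
    Base* copyHeld = duplicate(&original, held);

    // Walk back down the known path to reach the whole copy again.
    Most* copy = static_cast<Most*>(static_cast<Right*>(copyHeld));
    bool ok = copyHeld != held && copyHeld->id == 2 && copy->m == 7;
    delete copy;
    return ok;
}
```

The check relies on the two `Base` subobjects carrying different `id` values, so landing on the wrong one (or on unrelated memory) would be visible.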

Unfortunately, there's one more paragraph in the GCC docs:

When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.

Wham. So now it seems that even though the value of copyHeld is computed in accordance with the rules of the first two paragraphs, the third one still sends this into Undefined-Behaviour land.

I basically have three questions:

  1. Is my reading correct and the behaviour of duplicate undefined?

  2. Which kind of Undefined Behaviour is this? The "formally undefined, but will do what you want anyway" kind, or the "expect random crashes and/or spontaneous self-immolation" one?

  3. If it's really Undefined, is there a way to do such a thing in a well-defined (possibly compiler-dependent) way?

While my question is limited to GCC (and Clang) behaviour as far as compilers are concerned, I'd welcome an answer which considers all kinds of HW platforms, from common desktops to exotic ones.

Angew is no longer proud of SO
  • What this means is that you must not assume or try to use `p + n == PTR(INT(p) + n * sizeof(*p))`. – Kerrek SB Aug 12 '14 at 13:47
  • If there is a `virtual` somewhere in the inheritance chain from `MostDerived` to `HeldAs` I am afraid you could be in for a world of hurt. In the Itanium ABI it would work, I think, however the C++ Standard places no restriction on object layout. – Matthieu M. Aug 12 '14 at 13:56
  • @MatthieuM. Yes, there could be virtual inheritance involved. And I know (most) layout is not defined by the standard, but I would assume any sensible implementation would use the *same* layout for all (most-derived) objects of a particular type. Or is there a valid reason to do otherwise? – Angew is no longer proud of SO Aug 12 '14 at 13:58
  • @Angew: I cannot think of any off-hand, in the Itanium ABI it should be okay, in the MSVC ABI I don't know so you might want to check. – Matthieu M. Aug 12 '14 at 14:00
  • why are you hellbent on casting to integral types anyway? subtracting void-pointers will give ptrdiff_t. – sp2danny Aug 12 '14 at 16:11
  • @sp2danny You can't subtract `void*`s at all. And subtracting object pointers which don't point at elements of (or 1 past) the same array is Undefined Behaviour according to the standard itself. Whereas pointer/integer casts are implementation-defined. – Angew is no longer proud of SO Aug 12 '14 at 16:20
  • I was pretty sure that gave the same result as char*. Was that changed, or was I just mistaken? – sp2danny Aug 12 '14 at 16:22
  • @sp2danny They must both be "pointers to cv-qualified or cv-unqualified versions of the same completely-defined object type." `void` is an incomplete type which cannot be completed. – Angew is no longer proud of SO Aug 12 '14 at 16:32

1 Answer

The usual pattern for this is to put a virtual clone() in the base class.
Then each derived class can implement its own version of clone().

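class Base
{
     public:
        virtual ~Base() {}                    // allow deletion through a Base*
        virtual Base*  clone() const = 0;     // copying does not modify the source
};

class D: public Base
{
     public:
        virtual Base*  clone() const {  return new D(*this);}   // copies the complete D object
};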
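
A minimal usage sketch of this pattern, assuming you are free to modify the hierarchy (the `tag()` member, the `value` field and the `copyViaBase` helper are hypothetical additions for illustration only):

```cpp
#include <cassert>
#include <memory>

class Base
{
public:
    virtual ~Base() {}
    virtual Base* clone() const = 0;
    virtual int tag() const { return 0; }   // hypothetical, only to observe the copy
};

class D : public Base
{
public:
    explicit D(int v) : value(v) {}
    Base* clone() const override { return new D(*this); }
    int tag() const override { return value; }
private:
    int value;
};

// Copies an object through its base-class interface; the dynamic type is
// preserved because the most-derived override of clone() runs.
std::unique_ptr<Base> copyViaBase(const Base& b)
{
    return std::unique_ptr<Base>(b.clone());
}
```

Calling `copyViaBase` on a `D` yields a new object whose dynamic type is still `D`, even though the caller only ever sees a `Base`. Note that this requires the classes to be polymorphic and under your control.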
Martin York
  • why not `virtual D*` instead of `virtual Base*` in `D::clone()`? Shouldn't it work, as you have a covariant type? – vsoftco Aug 12 '14 at 23:43
  • @vsoftco: Because that really does not help. If you need the `clone()` method you are already calling from the base class (so you will be getting a `Base*` already no matter what the derived type declares). – Martin York Aug 12 '14 at 23:52
  • ahh ok, I was thinking of something like `Derived foo; Derived* d = foo.clone();`, so you don't have to downcast – vsoftco Aug 12 '14 at 23:56
  • If you know that `foo` is of type `Derived` then you can just use the copy constructor. `Derived* d = new Derived(foo);` – Martin York Aug 13 '14 at 04:37
  • Yes, I know. Unfortunately, the classes involved are totally outside my control. They might even not be polymorphic. – Angew is no longer proud of SO Aug 13 '14 at 07:08