Does the standard state that copies must be equivalent?

Question

Suppose I have a weird string type that either owns or doesn't own its underlying buffer:

#include <cstddef>  // size_t
#include <cstring>  // memcpy

class WeirdString {
private:
    char* buffer;
    size_t length;
    size_t capacity;
    bool owns;

public:
    // Non-owning constructor
    WeirdString(char* buffer, size_t length, size_t capacity)
        : buffer(buffer), length(length), capacity(capacity), owns(false)
    { }

    // Make an owning copy
    WeirdString(WeirdString const& rhs)
        : buffer(new char[rhs.capacity])
        , length(rhs.length)
        , capacity(rhs.capacity)
        , owns(true)
    {
        memcpy(buffer, rhs.buffer, length);
    }

    ~WeirdString() {
        if (owns) delete [] buffer;
    }
};

Does that copy constructor violate the standard somewhere? Consider:

WeirdString get(); // this returns non-owning string
const auto s = WeirdString(get());

s is either owning or non-owning depending on whether or not the additional copy constructor got elided, which in C++14 and earlier is permitted but optional (though in C++17 is guaranteed). That Schrödinger's ownership model suggests that this copy constructor is, in itself, undefined behavior.

Is it?

A more illustrative example might be:

struct X {
    int i;

    X(int i)
      : i(i)
    { }

    X(X const& rhs)
      : i(rhs.i + 1)   // note the "+ 1": this is not a plain copy
    { }
};

X getX();
const auto x = X(getX());

Depending on which copies get elided, x.i could be 0, 1, or 2 more than whatever was returned by getX(). Does the standard say anything about this?
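One way to see which copies actually run is to count them. The sketch below is an illustration only, not part of the original question: `CountingX`, `makeX`, and `observed_increment` are hypothetical names, and the static counter is an assumption added purely for observation.

```cpp
#include <cassert>

// Same idea as X, plus a counter recording how many copies really ran.
struct CountingX {
    int i;
    static int copies;   // number of copy-constructor calls actually made

    explicit CountingX(int i) : i(i) { }

    // Deliberately not a faithful copy: each call adds 1.
    CountingX(CountingX const& rhs) : i(rhs.i + 1) { ++copies; }
};
int CountingX::copies = 0;

CountingX makeX() { return CountingX(0); }

// Returns how much i grew, which equals the number of copies not elided.
int observed_increment() {
    CountingX::copies = 0;
    const auto x = CountingX(makeX());
    assert(x.i == CountingX::copies);
    return x.i;
}
```

Whatever the compiler elides, the final `i` matches the number of copy-constructor calls that really happened. The exact number is the part that may vary between compilers, flags, and language versions before C++17.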

From a C++ perspective, Schrödinger's cat is merely in an unspecified state. You don't get Undefined Behavior merely because you don't know the exact state from a set of well-defined possible states. — MSalters, Jan 23 '17 at 23:08
In `f() + g()`, it is unspecified whether `f` or `g` get called first; this is not, by itself, a reason to declare that the expression exhibits undefined behavior. It's possible, of course, that `g` somehow relies on a side effect produced by `f`, and exhibits undefined behavior in its absence. Yours is a similar situation: copy constructor may or may not be elided, and you may end up with owning or non-owning instance - that by itself does not trigger undefined behavior; but it's possible that something further down relies on the instance being in a particular state, and gets disappointed. — Igor Tandetnik, Jan 23 '17 at 23:17
@IgorTandetnik Yes, but we have explicit wording about how that situation would be undefined behavior. I find it strange that there is seemingly no wording about what a copy constructor is supposed to do. — Barry, Jan 23 '17 at 23:18
What wording do you have in mind, about what situation being undefined behavior? I'm not sure I follow. Anyway, what would you have the standard say about the copy constructor? It would be a challenge to define what it means for two instances of an arbitrary class to be "equivalent"? — Igor Tandetnik, Jan 23 '17 at 23:20
As I recall the rules are changing with C++17, in that elision is required. That will make the X example well-behaved. — Cheers and hth. - Alf, Jan 23 '17 at 23:21
@IgorTandetnik To start with, I'd have expected it to at least specify that the two instances should be equivalent, even if it's handwavy about what that means. — Barry, Jan 23 '17 at 23:28
What good would that be? If the standard cannot state the requirement precisely, then the programmer cannot verify whether their code meets that requirement. In any case, why does the behavior of the copy constructor, specifically, bother you so much? The non-deterministic abstract machine described by the standard is non-deterministic - this can be easily triggered by things other than copy elision. — Igor Tandetnik, Jan 23 '17 at 23:32
@Igor Well, that's what it does for the library. `vector` would be undefined behavior because `X` is not `CopyConstructible`. (assume `X` had a move ctor that similarly did something odd) — Barry, Jan 24 '17 at 00:03
`CopyConstructible` only means that the class provides a copy constructor. It doesn't mandate any particular behavior of said constructor. Both classes you show satisfy `CopyConstructible` requirement, and you can happily have a vector thereof. I'm not sure where you see a source of undefined behavior. — Igor Tandetnik, Jan 24 '17 at 00:32
@Igor No, it requires that the new object be "equivalent" to the old object and that the old object be unchanged. Neither type satisfies equivalence. — Barry, Jan 24 '17 at 00:34
Hmm, so it does. I have no idea what it means though; I can't find where "equivalent" is defined. Therefore, I don't see how one can decide whether two instances of `WeirdString` or `X` are or are not "equivalent" for the purposes of these requirements. I would argue it's a defect in the standard. — Igor Tandetnik, Jan 24 '17 at 00:50
@IgorTandetnik https://timsong-cpp.github.io/lwg-issues/1173 — T.C., Feb 26 '17 at 05:52

score 5 · Accepted Answer · answered Jan 23 '17 at 23:37

Regarding the new question's code

struct X {
    int i;

    X(int i)
      : i(i)
    { }

    X(X const& rhs)
      : i(rhs.i + 1)   // note the "+ 1": this is not a plain copy
    { }
};

X getX();
const auto x = X(getX());

Here the copy constructor doesn't copy, so you're breaking the compiler's assumption that it does.

With C++17 I believe you're guaranteed that it's not invoked in the above example. However I don't have a draft of C++17 at hand.
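As a rough check of the C++17 rule, here is a minimal sketch that should compile only when the copy is never needed at all. It assumes a C++17 (or later) compiler; `Pinned`, `makePinned`, and `pinned_value` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// A type that can be neither copied nor moved.
struct Pinned {
    int value;

    explicit Pinned(int v) : value(v) { }

    Pinned(Pinned const&) = delete;
    Pinned(Pinned&&) = delete;
};

// In C++17 the returned prvalue initializes the caller's object directly.
Pinned makePinned() { return Pinned(42); }

int pinned_value() {
    // Same shape as `const auto x = X(getX());` in the question.
    const auto p = Pinned(makePinned());
    assert(p.value == 42);
    return p.value;
}
```

In C++14 and earlier the same code is expected to be rejected, because an accessible copy or move constructor is still required even when the compiler elides the call.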

With C++14 and earlier it's up to the compiler whether the copy constructor is invoked for the call of getX, and whether it's invoked for the copy initialization.

C++14 §12.8/31 [class.copy]/31:

“When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects.”

This is not undefined behavior in the sense of the formal meaning of that term, where it can invoke nasal demons. For the formal terminology I'd choose unspecified behavior, because that's behavior that depends on the implementation and is not required to be documented. But as I see it what name one chooses doesn't really matter: what matters is that the standard just says that under the specified conditions a compiler can optimize a copy/move construction, regardless of the side effects of the optimized-away constructor – which you therefore can not and should not rely on.

For C++17, see [P0135](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0135r0.html), which is voted into the WP (the latest is N4618, which you can google). — Columbo, Jan 24 '17 at 10:12

score 4 · Answer 2 · edited May 23 '17 at 12:09

The part of the question about a class X was added after this answer. It's fundamentally different in that X's copy constructor does not copy. I've therefore answered that separately.

Regarding the original question's WeirdString: it's your class so the standard places no requirements on it.

However, the standard effectively lets compilers assume that a copy constructor copies, and does nothing else.

Happily that's what your copy constructor does, but if (I know this doesn't apply to you, but if) its main purpose had been some other effect that you relied on, then the copy elision rules could wreak havoc with your expectations.

Where you'd want a guaranteed owning instance (e.g. in order to pass it to a thread) you can simply provide an unshare member function, or a constructor with a tag argument, or a factory function.
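A minimal sketch of the tag-argument idea, under my own assumptions: `owning_copy_t`, `TaggedWeirdString`, `is_owning`, and `data` are hypothetical names, and disabling ordinary copies is just one possible design choice, not something the question's class does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical tag type: selects the constructor that always makes an owning copy.
struct owning_copy_t { };

class TaggedWeirdString {
    char* buffer;
    std::size_t length;
    std::size_t capacity;
    bool owns;

public:
    // Non-owning view over a caller-provided buffer
    TaggedWeirdString(char* buffer, std::size_t length, std::size_t capacity)
        : buffer(buffer), length(length), capacity(capacity), owns(false)
    { }

    // Owning copy on request. This is not a copy constructor, so the
    // copy elision rules do not apply to it.
    TaggedWeirdString(owning_copy_t, TaggedWeirdString const& rhs)
        : buffer(new char[rhs.capacity])
        , length(rhs.length)
        , capacity(rhs.capacity)
        , owns(true)
    {
        assert(length <= capacity);
        std::memcpy(buffer, rhs.buffer, length);
    }

    // Moving transfers the buffer and any ownership; the source is left empty.
    TaggedWeirdString(TaggedWeirdString&& rhs) noexcept
        : buffer(rhs.buffer), length(rhs.length), capacity(rhs.capacity), owns(rhs.owns)
    {
        rhs.buffer = nullptr;
        rhs.length = 0;
        rhs.capacity = 0;
        rhs.owns = false;
    }

    // Ordinary copying is disabled in this sketch to keep ownership explicit.
    TaggedWeirdString(TaggedWeirdString const&) = delete;
    TaggedWeirdString& operator=(TaggedWeirdString const&) = delete;

    ~TaggedWeirdString() {
        if (owns) delete [] buffer;
    }

    bool is_owning() const { return owns; }
    char const* data() const { return buffer; }
};
```

With this shape, `TaggedWeirdString own(owning_copy_t{}, view);` yields an owning instance regardless of what the compiler elides elsewhere.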

You can generally not rely on a copy constructor being invoked.

To avoid problems you'd better take care of all possible copying, which means also the copy assignment operator, operator=.

Otherwise you risk that two or more instances all think they own the buffer, and are responsible for deallocation.

It's also a good idea to support move semantics by defining a move constructor and declaring or defining a move assignment operator.

You can be more sure of correctness of all this by using a std::unique_ptr<char[]> to hold the buffer pointer.

Among other things that prevents inadvertent copying via a copy assignment operator.
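Roughly, the `std::unique_ptr<char[]>` approach could look like the sketch below. This is an assumption-laden illustration rather than a drop-in replacement: `UniqueWeirdString`, `owning_copy`, `is_owning`, and `data` are hypothetical names, and the owning copy is made through a factory function instead of a copy constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

class UniqueWeirdString {
    std::unique_ptr<char[]> owned;  // empty when this is a non-owning view
    char* data_;                    // points at owned.get() or at an external buffer
    std::size_t length;
    std::size_t capacity;

public:
    // Non-owning view over a caller-provided buffer
    UniqueWeirdString(char* buffer, std::size_t length, std::size_t capacity)
        : owned(), data_(buffer), length(length), capacity(capacity)
    { }

    // Factory function: always returns an owning copy of rhs.
    static UniqueWeirdString owning_copy(UniqueWeirdString const& rhs) {
        assert(rhs.length <= rhs.capacity);
        UniqueWeirdString result(nullptr, rhs.length, rhs.capacity);
        result.owned.reset(new char[rhs.capacity]);
        if (rhs.length != 0) {
            std::memcpy(result.owned.get(), rhs.data_, rhs.length);
        }
        result.data_ = result.owned.get();
        return result;
    }

    // Moves transfer ownership; the unique_ptr member makes copies ill-formed.
    UniqueWeirdString(UniqueWeirdString&&) = default;
    UniqueWeirdString& operator=(UniqueWeirdString&&) = default;

    bool is_owning() const { return static_cast<bool>(owned); }
    char const* data() const { return data_; }
};

// The unique_ptr member rules out inadvertent copying at compile time.
static_assert(!std::is_copy_constructible<UniqueWeirdString>::value,
              "copy construction should be unavailable");
static_assert(!std::is_copy_assignable<UniqueWeirdString>::value,
              "copy assignment should be unavailable");
```

No hand-written destructor is needed here, since the `unique_ptr` releases the buffer only when the instance actually owns one.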

`std::string` is irrelevant, and the elision of the copy *would* wreak havoc if I needed an owning string. The question is specifically about if there are any requirements placed on the copy constructor for copy elision to be valid. — Barry, Jan 23 '17 at 23:00
Sorry for a slight little binary inversion. Hm. I can see where you'd want a guaranteed owning instance to pass to a thread. One way to do that is to simply provide an `unshare` member function, or a constructor with a tag argument, or a factory function. — Cheers and hth. - Alf, Jan 23 '17 at 23:13
Do you mind just deleting everything after the first hr? It's unrelated to the question and distracting. The question isn't about how to properly implement a string, it's about the implications of having a not-quite-copy constructor (which the first part of your answer addresses). — Barry, Jan 24 '17 at 00:23
@Barry: OK, I guessed wrong about what you were doing this for. So, deleting the middle section. I think the last one, about taking charge of copying in general, is relevant still; isn't it? — Cheers and hth. - Alf, Jan 24 '17 at 01:14
