14

Using the return value of operator* from a "dead" unique_ptr is bad.

The following code compiles but results of course in Undefined Behavior:

auto& ref = *std::make_unique<int>(7);
std::cout << ref << std::endl;

Why didn't the standard make the return type of operator* for an rvalue of std::unique_ptr an rvalue of the internal value, instead of an lvalue, like this:

// could have been done inside unique_ptr
T& operator*() & { return *ptr; }
T&& operator*() && { return std::move(*ptr); }

In which case this would work fine:

std::cout << *std::make_unique<int>(7) << std::endl;

But the code at the beginning would not compile (cannot bind an rvalue to an lvalue).


Side note: of course someone could still write bad code like the below, but it is saying "I'm UB" more verbosely, IMHO, thus less relevant for this discussion:

auto&& ref = *std::make_unique<int>(7);
std::cout << ref << std::endl;

Is there any good reason for operator* on an rvalue of std::unique_ptr to return an lvalue ref?

Amir Kirsh
  • 12,564
  • 41
  • 74
  • 2
    Why: Because the standard lacks a r-value ref qualified overload of `operator*`. Why wasn't the ref qualified version added? Perhaps an oversight, maybe a good idea for a paper. – AndyG Jul 24 '19 at 14:40
  • 3
    Of course, it would have to be `T&& operator*() && { return ::std::move(*ptr); }`, otherwise it wouldn't even compile – Angew is no longer proud of SO Jul 24 '19 at 14:43
  • 1
    With your proposed change, would e.g. `auto ptr = std::make_unique(7); auto& ref = *ptr;` still work? – Some programmer dude Jul 24 '19 at 14:48
  • If I understand correctly, this defeats the purpose of having unique_ptr. If you want to get the value now, you go through the unique_ptr. If you want to get the value later, you go through the unique_ptr. – Kenny Ostrom Jul 24 '19 at 14:56
  • 1
    "*it is saying "I'm UB" more verbosely, thus not so relevant:*" I fail to see why. `auto&&` doesn't mean "I'm UB". – Nicol Bolas Jul 24 '19 at 15:01
  • @Someprogrammerdude yes of course `auto ptr = std::make_unique(7); auto& ref = *ptr;` would work: https://coliru.stacked-crooked.com/a/8ae324730da2cce6 – Amir Kirsh Jul 24 '19 at 15:27
  • @NicolBolas it is a more verbose UB because the code says to get back rvalue. – Amir Kirsh Jul 24 '19 at 15:38
  • 1
    @AmirKirsh: My point is that it's not "saying 'I'm UB'" at all. That is, it isn't more *explicitly* or obviously a declaration of UB than the lvalue reference version. – Nicol Bolas Jul 24 '19 at 15:40
  • @Angew sure, edited to use std::move for the rvalue overload, was already in my code but accidentally not copied correctly into the question. https://coliru.stacked-crooked.com/a/8ae324730da2cce6 – Amir Kirsh Jul 24 '19 at 17:38
  • `std::cout << *std::make_unique(7) << std::endl;` works even without adding an `operator*` for rvalues, because the temp `unique_ptr` stays in scope until the `;` is reached, so calling `operator*` on the temp is perfectly legal and well-defined. – Remy Lebeau Jul 24 '19 at 18:18
  • @RemyLebeau right, and there is no problem with that – Amir Kirsh Jul 24 '19 at 18:22
  • Functions do not "return" rvalues or lvalues. Value categories pertain to expressions. However, the value category of a function call expression is, at least in part, determined by the return type of that function. So you'd be better off asking about the return type of that `operator*`. – Asteroids With Wings Nov 07 '20 at 22:27

3 Answers3

11

Your code, in terms of the value categories involved and the basic idea, is the equivalent of this:

auto &ref = *(new int(7));

new int(7) results in a pointer object which is a prvalue expression. Dereferencing that prvalue results in an lvalue expression.

Regardless of whether the pointer object is an rvalue or lvalue, applying * to a pointer will result in an lvalue. That shouldn't change just because the pointer is "smart".

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
  • So, historical reasons and consistency? – Lightness Races in Orbit Jul 24 '19 at 15:10
  • That works (in the sense that the referred object still exists) because of the memory leak. In the case of a smart pointer, the reference would be dangling and useless. Why shouldn't this change with smart pointers, considering this expression appears to be useless with them? – eerorika Jul 24 '19 at 15:13
  • 1
    This is less of a use case and more of "let's preserve some UB also in smart pointers to keep C++ interesting"... which is less of an argument. I tend to believe that smart pointers were designed to get rid of as many UB of raw pointers as possible, not to preserve them. – Amir Kirsh Jul 24 '19 at 15:21
  • 1
    @AmirKirsh: It's not supposed to be a "use case". It's supposed to answer the question in your title. Why does it do it like this? Because a smart pointer should act like a pointer where possible, and pointers act this way. Nothing more needs to be said. – Nicol Bolas Jul 24 '19 at 15:32
  • 1
    @NicolBolas I defer. There is no sense in preserving bad behavior. The question, at the end is phrased: Is there any good usage of operator* on rvalue of std::unique_ptr, returning lvalue ref? There is a difference between saying: it is there for historical reasons but can be changed; or - there is a good use case for this behavior thus cannot change. – Amir Kirsh Jul 24 '19 at 15:41
  • @AmirKirsh With your solution applied to `unique_ptr` one might ask why it does not hold when `make_shared()` is used? C++ is a broad and complex language and you should not expect all the language features work together seamlessly. – jszpilewski Jul 24 '19 at 15:46
  • 1
    @AmirKirsh: Then your question is irrelevant. Maybe you'll present a proposal to have this changed, but it's not going to be accepted because it's both a needless difference between pointers and smart pointer and that it's not backwards compatible. You need something more than "I think this is bad" to get a non-backwards-compatible change through the committee. That is, it doesn't really matter if there's a "use case" for it or not, because the behavior isn't going anywhere. – Nicol Bolas Jul 24 '19 at 15:56
  • Am I missing something? "*Your code is the equivalent of this (barring the memory leak)*" seems incorrect, since the code in OP's example exhibits undefined behaviour (smart pointer deallocated the resources) while I see no reason for your code to not work. You can still do `delete &ref` and there is no UB, isn't there? – Fureeish Jul 24 '19 at 17:42
  • @Fureeish: It's equivalent in terms of the value category of the expressions involved, and the concept of "I dereference a pointer to an object I just dynamically allocated". That is, the only difference is that the pointer is smart in one case and dumb in another. – Nicol Bolas Jul 24 '19 at 17:57
  • Then I believe it's worth to specify that. Not long ago I answered a question regarding binding a reference to immediately dereferenced `new`. OP was asking whether that's UB. It's not, but I worry that some inexperienced users / programmers may come across this question, see the "*results of course with an Undefined Behavior*", see your answer, notice the "*equivalent*" part and assume it's also UB. – Fureeish Jul 24 '19 at 18:00
4

Good question!

Without digging into the relevant papers and design discussions, I think there are a few points that are maybe the reasons for this design decision:

  1. As @Nicol Bolas mentioned, this is how a built-in (raw) pointer would behave, so "do as int does" is applied here as "do as int* does".

    This is similar to the fact that unique_ptr (and other library types) don't propagate constness (which in turn is why we are adding propagate_const).

  2. What about the following code snippet? It doesn't compile with your suggested change, while it is a valid code that shouldn't be blocked.

class Base { virtual ~Base() = default; };
class Derived : public Base {};
void f(Base&) {}

int main()
{
    f(*std::make_unique<Derived>());
}

(godbolt - it compiles if our operator* overloadings are commented out)


For your side note: I'm not sure auto&& says "I'm UB" any louder. On the contrary, some would argue that auto&& should be our default for many cases (e.g. range-based for loop; it was even suggested to be inserted automatically for "terse-notation range-based for loop" (which wasn't accepted, but still...)). Let's remember that rvalue-ref has similar effect as const &, extension of the lifetime of a temporary (within the known restrictions), so it doesn't necessarily look like a UB in general.

Yehezkel B.
  • 1,140
  • 6
  • 10
2

std::cout << *std::make_unique<int>(7) << std::endl; already works as the temporary dies at the end of the full expression.

T& operator*() & { return *ptr; }
T&& operator*() && { return std::move(*ptr); }

wouldn't avoid the dangling reference, (as for your example)

auto&& ref = *std::make_unique<int>(7); // or const auto&
std::cout << ref << std::endl;

but indeed, would avoid binding a temporary to a non-const lvalue reference.

Another safer alternative would be:

T& operator*() & { return *ptr; }
T operator*() && { return std::move(*ptr); }

to allow the lifetime extension, but that would do an extra move constructor not necessarily wanted in the general case.

Boann
  • 48,794
  • 16
  • 117
  • 146
Jarod42
  • 203,559
  • 14
  • 181
  • 302
  • This safer alternative has the problem I described in my answer that I deleted, because it doesn't apply to returning a reference. The problem is that it introduces an extra copy / move, which is not always an option, in particular when the element type is non-copyable and non-movable. – eerorika Jul 24 '19 at 17:10