2

I have a worker thread giving my rendering thread a std::shared_ptr<Bitmap> as part of how I download texture data from the GPU. Both threads depend on std::shared_ptr<...>::unique() to determine if the other thread is done with the Bitmap. No other copies of this shared pointer are in play.

Here's a trimmed down chart of the behavior:

          worker thread             |            rendering thread
----------------------------------- + -------------------------------------------------
std::shared_ptr<Bitmap> bitmap;     |  std::queue<std::shared_ptr<Bitmap>> copy;
{                                   |  {
    std::scoped_lock lock(mutex);   |     std::scoped_lock lock(mutex);
    queue.push_back(bitmap);        |     std::swap(queue, copy);
}                                   |  }
...                                 |  for(auto it = copy.begin(); it != copy.end(); ++it) 
                                    |     if (it->unique() == false)
if (bitmap.unique())                |         fillBitmap(*it); // writing to the bitmap
    bitmap->saveToDisk(); // reading       

Is it safe to use unique() or use_count() == 1 to accomplish this?

edit:

Ah... since unique() is unsafe in this case, I think instead I'm going to try something like std::shared_ptr<std::pair<Bitmap, std::atomic<bool>>> bitmap; instead, flipping the atomic when it's filled.

Anne Quinn
  • 12,609
  • 8
  • 54
  • 101
  • 1
    No it is not. See the note here about the deprecation of `unique`: https://en.cppreference.com/w/cpp/memory/shared_ptr/unique. – wohlstad Jun 18 '22 at 14:41
  • `use_count` is also problematic. from the link I posted above: "use_count is only an approximation in multithreaded environment". – wohlstad Jun 18 '22 at 14:42
  • See also: [Checking for sole ownership of shared_ptr](https://stackoverflow.com/questions/28118904/checking-for-sole-ownership-of-shared-ptr) – rustyx Jun 18 '22 at 14:47
  • @wohlstad - That's disappointing... I hope they consider deprecating use_count() too if it's only approximate. Or at least guarantee it's accuracy in the absence of weak_ptr usage. – Anne Quinn Jun 18 '22 at 14:50
  • The worker thread should have an input queue and an output queue. Add the bitmap to the first like you do now but then have the worker thread add the bitmap to the output after `fillBitmap` has finished. Both queues need a mutex for protection and as synchronization of the bitmap data. – Goswin von Brederlow Jun 18 '22 at 16:03

1 Answers1

4

No, it is not safe. A call to unique (or to use_count) does not imply any synchronization.

If e.g. the worker thread observes bitmap.unique() as true, this does not imply any memory ordering on the accesses made by fillBitmap(*it); and those made by bitmap->saveToDisk();. The function call may be implemented as a relaxed load, which doesn't have the required acquire-release effect.

Without any memory ordering, you will likely have a data race and undefined behavior as consequence.

This is why std::shared_ptr::unique was deprecated with C++17 and removed with C++20, since it is so easy to misuse in this way. Just using use_count() == 1 doesn't work either. It has the exact same issues. I guess it wasn't removed as well, because there may be valid uses cases even in multi-threaded environments for use_count if you only need it to get a fuzzy idea of the number of current users and because there should still be a way of implementing unique for a single-threaded context (where it doesn't have any problems).

According to the proposal to remove std::shared_ptr::unique, many implementations do in fact use a relaxed load, implying that problems may actually happen on real implementations, see https://open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0521r0.html.


If you are on an architecture like x86 you may be saved by the fact that on the hardware level a relaxed load is automatically an acquire load.

However, the C++ memory model does not make any ordering guarantees here and I think (might be wrong here) for example ARM is an example for an architecture where normal (relaxed) loads don't automatically have acquire/release semantics.

And in any case, because there are no memory ordering guarantees, I think the compiler could still perform transformations that will cause issues in any case. For example, because unqiue doesn't imply any synchronization, the compiler could load data used in bitmap->saveToDisk(); before actually loading the reference counter. The load may be unnecessary if the condition on the reference counter turns out to not be fulfilled, but I don't see anything preventing the compiler from adding a superfluous load.

Inbetween these loads some stores from fillBitmap(*it); and the destruction of the rendering threads std::shared_ptr could happen, causing the worker thread to use data from before the stores in fillBitmap(*it);.

user17732522
  • 53,019
  • 2
  • 56
  • 105
  • If `bitmap.unique()` then the bitmap is no longer present in `queue` or `copy` so `fillBitmap` can not be called on it. `fillBitmap` can only be called when the refcount is already 2. – Goswin von Brederlow Jun 18 '22 at 14:54
  • 1
    @GoswinvonBrederlow Yes, but it might have been called on it beforehand and the shared_ptr instance in the queue destroyed before the `unique` call in the worker thread as well. However, because there is no synchronization guarantee and the `unique` call may be using a relaxed load, there is no implied memory ordering outside the reference counter. For that an acquire load would be necessary and a synchronization guarantee in the standard to force implementations to use that. – user17732522 Jun 18 '22 at 14:58
  • I don't think it can. No mater how you construct the execution order the `saveToDisk` can only run after `fillBitmap`. The danger is that `saveToDisk` may not be run even though `fillBitmap` has finished and `copy` was destroyed. Similar `fillBitmap` may be run even though `bitmap` has gone out of scope and there is nobody waiting to save the result. I don't quite get the `unique` call in the renderer. If you don't want it rendered then don't put it in the queue. – Goswin von Brederlow Jun 18 '22 at 15:38
  • @GoswinvonBrederlow I just edited in an example of what kind of issues I could see practically. The first half of my answer is based purely on the guarantees made by the standard. – user17732522 Jun 18 '22 at 15:42
  • You are talking about the compiler reordering loads and store across the call to `unique` within the same thread. And what actually has the problem of being a race condition is that reordered load or store. I agree that the compiler could reorder some load from `saveToDisk` to before the call to `unique` and then it might read data before `fillBitmap` has finished. I don't think stores in `fillBitmap` can be reordered past the `shared_ptr` destructor since the ref count decrement includes a barrier, right? – Goswin von Brederlow Jun 18 '22 at 15:53
  • @GoswinvonBrederlow Practically speaking yes, I think the compiler cannot reorder a store from before the ref count decrement to after it. But a relaxed load in the `unique` call doesn't have the analogues effect. – user17732522 Jun 18 '22 at 15:56