Trying to control multithreaded access to array using std::atomic

Question

I'm trying to control multithreaded access to a vector of data which is fixed in size, so threads will wait until their current position in it has been filled before trying to use it, or will fill it themselves if no-one else has yet. (But ensure no-one is waiting around if their position is already filled, or no-one has done it yet)

However, I am struggling to understand a good way to do this, especially involving std::atomic. I'm just not very familiar with C++ multithreading concepts aside from basic std::thread usage.

Here is a very rough example of the problem:

class myClass
{
  struct Data
  {
    int res1;
  };

  std::vector<Data*> myData;

  int foo(unsigned long position)
  {
    if (!myData[position])
      {
        bar(myData[position]);
      }

    // Do something with the data
    return 5 * myData[position]->res1;
  }

  void bar(Data* &data)
  {
    data = new Data;

    // Do a whole bunch of calculations and so-on here

    data->res1 = 42;
  }
};

Now imagine if foo() is being called multi-threaded, and multiple threads may (or may not) have the same position at once. If that happens, there's a chance that a thread may (between when the Data was created and when bar() is finished, try to actually use the data.

So, what are the options?

1: Make a std::mutex for every position in myData. What if there are 10,000 elements in myData? That's 10,000 std::mutexes, not great.

2: Put a lock_guard around it like this:

std::mutex myMutex;

{
    const std::lock_guard<std::mutex> lock(myMutex);

    if (!myData[position])
      {
        bar(myData[position]);
      }
}

While this works, it also means if different threads are working in different positions, they wait needlessly, wasting all of the threading advantage.

3: Use a vector of chars and a spinlock as a poor man's mutex? Here's what that might look like:

static std::vector<char> positionInProgress;
static std::vector<char> positionComplete;

class myClass
{
  struct Data
  {
    int res1;
  };

  std::vector<Data*> myData;

  int foo(unsigned long position)
  {   
    if (positionInProgress[position])
      {
        while (positionInProgress[position])
          {
            ;  // do nothing, just wait until it is done
          }
      }
    else
      {
        if (!positionComplete[position])
          {
            // Fill the data and prevent anyone from using it until it is complete
            positionInProgress[position] = true;
            bar(myData[position]);
            positionInProgress[position] = false;

            positionComplete[position] = true;
          }
      }

    // Do something with the data
    return 5 * myData[position]->res1;
  }

  void bar(Data* data)
  {
    data = new Data;

    // Do a whole bunch of calculations and so-on here

    data->res1 = 42;
  }
};

This seems to work, but none of the test or set operations are atomic, so I have a feeling I'm just getting lucky.

4: What about std::atomic and std::atomic_flag? Well, there are a few problems.

std::atomic_flag doesn't have a way to test without setting in C++11...which makes this kind of difficult.
std::atomic is not movable or copy-constructable, so I cannot make a vector of them (I do not know the number of positions during construction of myClass)

Conclusion:

This is the simplest example that (likely) compiles I can think of that demonstrates my real problem. In reality, myData is a 2-dimensional vector implemented using a special hand-rolled solution, Data itself is a vector of pointers to more complex data types, the data isn't simply returned, etc. This is the best I could come up with.

Unrelated: Just in case this isn't the result of an oversimplification, `void bar(Data* data)` leaks. `data` is a local variable, so `data = new Data;` allocates storage and assigns it to a pointer that's going out of scope at the end of the function. — user4581301, Feb 14 '20 at 22:32
@user4581301 It's an extremely simple example I just put up to get my point across. I'm sure I missed more than just that. — Tyler Shellberg, Feb 14 '20 at 22:35
@Sneftel Could you go into a bit more depth? I haven't used condition variables before. Does that imply I need one std::condition_variable, *and* one ```std::mutex``` per position? — Tyler Shellberg, Feb 14 '20 at 22:36
It's seems that you are looking for `lock-free` algorithm for a vector. If I would want to do something like that, I would search in google for `lock-free` vector algorithm, and if it just not there (not likely), I would create `class lock_free_vector : public std::vector<..>` and use mutex for every cell (unless you know for sure that no more than 2 threads will try to access the same cell at the same time, and then I would use atomic bool). — Coral Kashri, Feb 14 '20 at 22:41
How bad is it to create two Data objects for the same spot, and then throw one away? If that's okay, you can use a compare-exchange. — Raymond Chen, Feb 14 '20 at 22:51
@RaymondChen They would come out identical. However, if one thread starts ```bar()``` just as another is about to finish it, then we've waited effectively 2x as long for the same position to be done. Also, we'd need to wait on whomever is slower so we can wait on them all to finish for that position and move on afterwards. ```bar()``` is not trivial in the real code, it's expensive. — Tyler Shellberg, Feb 14 '20 at 22:53
The idea is that the first one to finish bar sets the value with compare-exchange, and then proceeds. The second thread eventually finishes its bar and realizes that the work is already done so it destroys the Data and uses the one from the previous thread. No thread waits more than one bar, but you do create an extra Data and then immediately destroy it. — Raymond Chen, Feb 14 '20 at 22:59
@user4581301 and Tyler: Making your example less nonsense when possible is a good idea, though. I fixed that for you by taking `Data* &data` by reference so you can assign `new Data` to the caller's pointer. — Peter Cordes, Feb 15 '20 at 06:12
3. That's not a valid spinlock: `if(!owned) owned=true` isn't an atomic RMW so two threads could both enter the critical section. And `vector` isn't even atomic so you'd have data-race UB (e.g. your spin-wait would optimize to an infinite loop that doesn't re-read memory). If you were going to roll your own, you'd use `vector>` or something. But that's pretty bad, might as well just use a vector of atomic pointers, maybe with some non-NULL sentinel value. — Peter Cordes, Feb 15 '20 at 06:15

Chris Dodd · Answer 1 · 2020-02-14T23:35:45.053

The biggest problem you're likely to have is that a vector itself is not thread-safe, so you can't do ANY operation that might chage the vector (invalidate references to elements of the vector) while another thread might be accessing it, such as resize or push_back. However, if you vector is effectively "fixed" (you set the size prior to ever spawning threads and thereafter only ever access elements using at or operator[] and never ever modify the vector itself), you can get away with using a vector of atomic objects. In this case you could have:

std::vector<std::atomic<Data*>> myData;

and your code to setup and use an element could look like:

if (!myData[position]) {
    Data *tmp = new Data;
    if (!mydata[position].compare_exchange_strong(nullptr, tmp)) {
        // some other thread did the setup
        delete tmp; } }
myData[position]->bar();

Of course you still need to make sure that the operations done on members of Data in bar are themselves threadsafe, as you can get mulitple threads calling bar on the same Data instance here.

I should have specified, the vector is indeed fixed size prior to any threading. As for the rest, this seems helpful, but I'll need to take a closer look at it next week. Thank you. — Tyler Shellberg, Feb 14 '20 at 23:48
Yup, I was thinking something like that too. Note that `atomic` isn't copy-constructible so you need to set the vector's size in a constructor, not with .resize. Or you could new/delete an array yourself instead of using vector. — Peter Cordes, Feb 15 '20 at 06:19

Trying to control multithreaded access to array using std::atomic

1 Answers1