0

I am doing multithreading in C++. This may be something very standard, but I can't seem to find it anywhere, and I don't know the key terms to search for online.

I want to do some sort of computation many times, but with multiple threads. For each iteration of the computation, I want to find the next available thread (one that has finished its previous computation) to run the next iteration. I don't want to cycle through the threads in order, since the next thread in the cycle may not have finished its work yet.

E.g. suppose I have a vector of int and I want to sum up the total with 5 threads. I store the to-be-updated total sum somewhere, along with a count of which element I am currently up to. Each thread looks at the count to find the next position, takes that vector value, and adds it to the total sum so far. Then it goes back to the count to do the next iteration. So for each iteration, the count increments and the work goes to the next available thread (maybe one is already waiting on the count, or maybe they are all still busy). We do not increase the number of threads, but I want to be able to somehow search through all 5 threads for the first one that finishes, and hand it the next computation.

How would I go about coding this? Every way I know of involves looping through the threads in order, so I can't check for the next available one, which may be out of order.

domoremath
  • 555
  • 2
  • 8
  • 17
  • For the record, summing a `vector` is a terrible case for coordinating tasks through worker threads that eagerly pull from a common set of values; the amount of work to do is tiny, and the cost of synchronizing to ensure each value is counted only once is high. Partitioning the data up front makes way more sense here, as it removes the need for synchronization (aside from waiting for all threads to finish before combining their results), and makes the data access pattern for each thread predictable (good for any memory system prefetch heuristics). – ShadowRanger Feb 17 '17 at 03:48

2 Answers

0

Use a semaphore (or mutex, I always mix those two up) on a global variable telling you what is next. The mutex will lock the other threads out while you access the variable, so each thread's access is exclusive.

So, assuming you have an array of X elements and a global called nextfree, which is initialized to 0, the pseudocode would look like this:

while (true)
{
    mtx.lock();                     // lock the mutex guarding the shared state
    if (nextfree >= X)
    {
        mtx.unlock();               // release the mutex
        return;                     // exit and terminate the thread
    }
    int chunk = data[nextfree];     // get the data based on "nextfree"
    nextfree++;
    mtx.unlock();                   // release the mutex

    // do your stuff with the chunk you got
}

The point here is that each thread has exclusive access to the shared data while it holds the lock, and can therefore grab the next available chunk regardless of what the others are doing. (Other threads that finish while one thread is getting its next data chunk will have to wait in line. When you release the lock, only ONE thread standing in the queue gets access; the rest keep waiting.)

There are some things to beware of. A mutex (or semaphore) can lock up your program if you exit in the wrong place without releasing it, or create a deadlock.

Aslak Berby
  • 183
  • 1
  • 7
0

This is a thread pool:

#include <atomic>
#include <condition_variable>
#include <deque>
#include <future>
#include <mutex>
#include <type_traits>
#include <utility>
#include <vector>
#include <boost/optional.hpp>

template<class T>
struct threaded_queue {
  using lock = std::unique_lock<std::mutex>;
  void push_back( T t ) {
    {
      lock l(m);
      data.push_back(std::move(t));
    }
    cv.notify_one();
  }
  boost::optional<T> pop_front() {
    lock l(m);
    cv.wait(l, [this]{ return abort || !data.empty(); } );
    if (abort) return {};
    auto r = std::move(data.front());
    data.pop_front();
    return std::move(r);
  }
  void terminate() {
    {
      lock l(m);
      abort = true;
      data.clear();
    }
    cv.notify_all();
  }
  ~threaded_queue()
  {
    terminate();
  }
private:
  std::mutex m;
  std::deque<T> data;
  std::condition_variable cv;
  bool abort = false;
};
struct thread_pool {
  thread_pool( std::size_t n = 1 ) { start_thread(n); }
  thread_pool( thread_pool&& ) = delete;
  thread_pool& operator=( thread_pool&& ) = delete;
  ~thread_pool() = default; // or `{ terminate(); }` if you want to abandon some tasks
  template<class F, class R=std::result_of_t<F&()>>
  std::future<R> queue_task( F task ) {
    std::packaged_task<R()> p(std::move(task));
    auto r = p.get_future();
    tasks.push_back( std::move(p) );
    return r;
  }
  template<class F, class R=std::result_of_t<F&()>>
  std::future<R> run_task( F task ) {
    if (threads_active() >= total_threads()) {
      start_thread();
    }
    return queue_task( std::move(task) );
  }
  void terminate() {
    tasks.terminate();
  }
  std::size_t threads_active() const {
    return active;
  }
  std::size_t total_threads() const {
    return threads.size();
  }
  void clear_threads() {
    terminate();
    threads.clear();
  }
  void start_thread( std::size_t n = 1 ) {
    while(n-->0) {
      threads.push_back(
        std::async( std::launch::async,
          [this]{
            while(auto task = tasks.pop_front()) {
              ++active;
              try{
                (*task)();
              } catch(...) {
                --active;
                throw;
              }
              --active;
            }
          }
        )
      );
    }
  }
private:
  std::vector<std::future<void>> threads;
  threaded_queue<std::packaged_task<void()>> tasks;
  std::atomic<std::size_t> active{0};
};

You give it a number of threads, either at construction or via start_thread.

You then call queue_task. This returns a std::future that tells you when the task is completed (and carries its result).

As threads finish a task, they go to the threaded_queue and look for more.

When a threaded_queue is destroyed, it discards any tasks still queued in it.

When a thread_pool is destroyed, it aborts all future tasks, then waits for all of the outstanding tasks to finish.

Live example.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524