How to use std::condition_variable in a loop

Question

I'm trying to implement some algorithm using threads that must be synchronized at some moment. More or less the sequence for each thread should be:

1. Try to find a solution with current settings.
2. Synchronize solution with other threads.
3. If any of the threads found solution end work.
4. (empty - to be inline with example below)
5. Modify parameters for algorithm and jump to 1.

Here is a toy example with algorithm changed to just random number generation - all threads should end if at least one of them will find 0.

#include <iostream>
#include <condition_variable>
#include <thread>
#include <vector>

const int numOfThreads = 8;

std::condition_variable cv1, cv2;
std::mutex m1, m2;
int lockCnt1 = 0;
int lockCnt2 = 0;

int solutionCnt = 0;

void workerThread()
{
    while(true) {
        // 1. do some important work
        int r = rand() % 1000;

        // 2. synchronize and get results from all threads
        {
            std::unique_lock<std::mutex> l1(m1);
            ++lockCnt1;
            if (r == 0) ++solutionCnt; // gather solutions
            if (lockCnt1 == numOfThreads) {
                // last thread ends here
                lockCnt2 = 0;
                cv1.notify_all();
            }
            else {
                cv1.wait(l1, [&] { return lockCnt1 == numOfThreads; });
            }
        }

        // 3. if solution found then quit all threads
        if (solutionCnt > 0) return;

        // 4. if not, then set lockCnt1 to 0 to have section 2. working again
        {
            std::unique_lock<std::mutex> l2(m2);
            ++lockCnt2;
            if (lockCnt2 == numOfThreads) {
                // last thread ends here
                lockCnt1 = 0;
                cv2.notify_all();
            }
            else {
                cv2.wait(l2, [&] { return lockCnt2 == numOfThreads; });
            }
        }

        // 5. Setup new algorithm parameters and repeat.
    }
}

int main()
{
    srand(time(NULL));

    std::vector<std::thread> v;
    for (int i = 0; i < numOfThreads ; ++i) v.emplace_back(std::thread(workerThread));
    for (int i = 0; i < numOfThreads ; ++i) v[i].join();

    return 0;
}

The questions I have are about sections 2. and 4. from code above.

A) In a section 2 there is synchronization of all threads and gathering solutions (if found). All is done using lockCnt1 variable. Comparing to single use of condition_variable I found it hard how to set lockCnt1 to zero safely, to be able to reuse this section (2.) next time. Because of that I introduced section 4. Is there better way to do that (without introducing section 4.)?

B) It seems that all examples shows using condition_variable rather in context of 'producer-consumer' scenario. Is there better way to synchronization all threads in case where all are 'producers'?

Edit: Just to be clear, I didn't want to describe algorithm details since this is not important here - anyway this is necessary to have all solution(s) or none from given loop execution and mixing them is not allowed. Described sequence of execution must be followed and the question is how to have such synchronization between threads.

This is not really a good example where you'd like to use condition variables. There's no need to sync between threads at all since that'd prevent a thread that gets more CPU time to use the time to do work. By syncing, you force all threads to work at the speed of the slowest thread, which is not good when you'd like to find a solution to a problem as quickly as possible. [Example @ rextester](https://rextester.com/ZEST41735) — Ted Lyngmo, Jun 28 '19 at 12:56
My suggestion is to not do this, You can simplify the entire program into `std::atmoic_bool found = false; void workerThread() { while (!found) { calculate; if(solution) { found = true; return; } setup_next_pass; }`. As soon one thread finds a solution all other threads will stop at then end of their current iteration and return. — NathanOliver, Jun 28 '19 at 13:05
@NathanOliver: I didn't want to describe algorithm details but this scenario of execution is a must. This is parallelized version of seqential algorithm which was: 1) run N-times frist part of alg. 2) then checked if there is solution(s) 3) if yes it was ended 4) if not some parameters are modified basing on other calculations from (1) and all is repeated. So I need synchronization at each loop execution to get all results from all threads. The time of running of algorithm is long enough and difference between slowest/fastest thread is minimal so synchronization is not harmful. — Krzysztof, Jun 28 '19 at 13:20
@TedLyngmo: This would work only in case if each thread can run independently of each other. Unfortunately I need synchronization after each loop execution to decide how to update parameters for next loop if needed (and this is based on information from all threads). That is why I just wanted focus only on sections (2) and (4) since the rest of procedure cannot be changed for now. Maybe toy example I presented is too 'toy' :-) As I wrote in comment above, section (1) takes quite lot of time and more or less it takes same time in each thread so time lost on synchronization in negligible. — Krzysztof, Jun 28 '19 at 13:40
@Krzysztof I see. Then your code looks like it could potentially work, except you're reading `solutionCnt` unguarded. I think putting that checker last inside the guarded scope would be better. — Ted Lyngmo, Jun 28 '19 at 13:52
Do you have access to boost::asio ? https://www.boost.org/doc/libs/1_66_0/doc/html/boost_asio/reference/io_context.html — Hui, Jun 28 '19 at 16:04

score 1 · Answer 1 · answered Jun 28 '19 at 12:50

A) You could just not reset the lockCnt1 to 0, just keep incrementing it further. The condition lockCnt2 == numOfThreads then changes to lockCnt2 % numOfThreads == 0. You can then drop the block #4. In future you could also use std::experimental::barrier to get the threads to meet.

B) I would suggest using std::atomic for solutionCnt and then you can drop all other counters, the mutex and the condition variable. Just atomically increase it by one in the thread that found solution and then return. In all threads after every iteration check if the value is bigger than zero. If it is, then return. The advantage is that the threads do not have to meet regularly, but can try to solve it at their own pace.

score 0 · Answer 2 · answered Jun 28 '19 at 22:30

Out of curiosity, I tried to solve your problem using std::async. For every attempt to find a solution, we call async. Once all parallel attempts have finished, we process feedback, adjust parameters, and repeat. An important difference with your implementation is that feedback is processed in the calling (main) thread. If processing feedback takes too long — or if we don't want to block the main thread at all — then the code in main() can be adjusted to also call std::async.

The code is supposed to be quite efficient, provided that the implementation of async uses a thread pool (e. g. Microsoft's implementation does that).

#include <chrono>
#include <future>
#include <iostream>
#include <vector>

const int numOfThreads = 8;

struct Parameters{};
struct Feedback {
    int result;
};

Feedback doTheWork(const Parameters &){
    // do the work and provide result and feedback for future runs
    return Feedback{rand() % 1000};
}

bool isSolution(const Feedback &f){
    return f.result == 0;
}

// Runs doTheWork in parallel. Number of parallel tasks is same as size of params vector
std::vector<Feedback> findSolutions(const std::vector<Parameters> &params){

    // 1. Run async tasks to find solutions. Normally threads are not created each time but re-used from a pool
    std::vector<std::future<Feedback>> futures;
    for (auto &p: params){
        futures.push_back(std::async(std::launch::async,
                                    [&p](){ return doTheWork(p); }));
    }

    // 2. Syncrhonize: wait for all tasks
    std::vector<Feedback> feedback(futures.size());
    for (auto nofRunning = futures.size(), iFuture = size_t{0}; nofRunning > 0; ){

        // Check if the task has finished (future is invalid if we already handled it during an earlier iteration)
        auto &future = futures[iFuture];
        if (future.valid() && future.wait_for(std::chrono::milliseconds(1)) != std::future_status::timeout){
            // Collect feedback for next attempt
            // Alternatively, we could already check if solution has been found and cancel other tasks [if our algorithm supports cancellation]
            feedback[iFuture] = std::move(future.get());
            --nofRunning;
        }

        if (++iFuture == futures.size())
            iFuture = 0;
    }
    return feedback;
}

int main()
{
    srand(time(NULL));

    std::vector<Parameters> params(numOfThreads);
    // 0. Set inital parameter values here

    // If we don't want to block the main thread while the algorithm is running, we can use std::async here too
    while (true){
        auto feedbackVector = findSolutions(params);
        auto itSolution = std::find_if(std::begin(feedbackVector), std::end(feedbackVector), isSolution);

        // 3. If any of the threads has found a solution, we stop
        if (itSolution != feedbackVector.end())
            break;

        // 5. Use feedback to re-configure parameters for next iteration
    }

    return 0;
}

How to use std::condition_variable in a loop

2 Answers2