1

In an application I'm writing I have a threading model which is a simplified as the following:

  • a generator jthread (m_WorkerGenerator) is starting async tasks.
  • the multiple async tasks work until the the generator thread is stopped. For this, they use the reference of the std::stop_token (m_token) and they wait on the same condition_variable_any (m_cv), locked under the same mutex (m_mut). The deadlock happens before the std::jthread::request_stop() is called on the m_WorkerGenerator.
#include <iostream>
#include <condition_variable>
#include <thread>
#include <chrono>
#include <future>

using namespace std::chrono_literals;

class Foo {
std::condition_variable_any m_cv;
std::mutex m_mut;
std::stop_token m_token;
std::jthread m_WorkerGenerator;

void worker() {
    std::cout << "Worker thread start" << std::endl;

    while (true) {
        std::unique_lock lck{ m_mut };
        if (m_cv.wait_for(lck, m_token, 5ms, [=]() { return m_token.stop_requested(); })) {
            break;
        }
    }

    std::cout << "Worker thread stop" << std::endl;
}

public:

Foo() {
    m_WorkerGenerator = std::jthread{ [&](std::stop_token t) {
        m_token = t;

        std::vector<std::future<void>> futures;
        while (!t.stop_requested()) {

            auto fut = std::async(std::launch::async, [=]() {
                worker();
                });
            futures.emplace_back(std::move(fut));

            std::this_thread::sleep_for(5ms);
        }
    } };
}
};

int main()
{
    Foo f;
    std::this_thread::sleep_for(50ms); // Increase here if you can't reproduce
}

If I rewrite using the condition_variable_any::wait_for without the stop_token and signaling it manually from the stop_callback the deadlock is not happening.

#include <iostream>
#include <condition_variable>
#include <thread>
#include <chrono>
#include <future>

using namespace std::chrono_literals;

class Foo {
std::condition_variable_any m_cv;
std::mutex m_mut;
std::stop_token m_token;
std::jthread m_WorkerGenerator;

void worker() {
    std::cout << "Worker thread start" << std::endl;

    while (true) {
        std::unique_lock lck{ m_mut };
        if (m_cv.wait_for(lck, 5ms, [=]() { return m_token.stop_requested(); })) {
            break;
        }
    }

    std::cout << "Worker thread stop" << std::endl;
}

public:

Foo() {
    m_WorkerGenerator = std::jthread{ [&](std::stop_token t) {
        m_token = t;

        std::stop_callback(t, [=]() {
            m_cv.notify_all();
            });

        std::vector<std::future<void>> futures;
        while (!t.stop_requested()) {

            auto fut = std::async(std::launch::async, [=]() {
                worker();
                });
            futures.emplace_back(std::move(fut));

            std::this_thread::sleep_for(5ms);
        }
    } };
}
};

int main()
{
    Foo f;
    std::this_thread::sleep_for(5000ms);
}

1 Answers1

0

This is a typical Lost Wakeup.

In a nutshell and just to self-check, ask yourself each time, what makes you think that any worker will be executed at least once till the wait_for point while you endlessly create new ones in Foo::Foo ? They could and randomly do, but, again, what makes you think they must do so? Nobody guarantees this even with this sleep_for. So, formally this code could overflow memory at some point.

If case the article is not enough and you want more specifics and solution approach, see the:

Why 'wait with predicate' solves the 'lost wakeup' for condition variable?

Ahtisham
  • 9,170
  • 4
  • 43
  • 57
Damir Tenishev
  • 1,275
  • 3
  • 14
  • I suggest either posting a self-contained answer, or closing the question as a duplicate. – HolyBlackCat Mar 02 '23 at 18:34
  • @HolyBlackCat, this is not a duplicate. I added some extra information. Anyway, the part of the answer is in the question. Let's give the author some time for homework and if he accepts the answer we will ask him to post the final solution as "solidification". Meanwhile, I don't mind of any resolution. – Damir Tenishev Mar 02 '23 at 18:38
  • @HolyBlackCat, one extra edit that should help avoiding "false duplication". The answer is the first article with pinpoint in the paragraph beneath the link. And the link to the other SO article is just kind of extra help. – Damir Tenishev Mar 02 '23 at 19:26
  • I also consider this is not a duplicate. I'm not asking why a wait with a predicate and protected by mutex is needed, the samples already have a predicate. The emphasis of the question is if is possible and how to use the version of the wait_for with a stop_token, with multiple threads and the same condition variable. If instead of using the stop_token I would use some protected data flag (e.g. "is generator thread done") I would be reducing the question to https://stackoverflow.com/questions/57300380/can-multiple-threads-wait-on-the-same-condition-variable, but that's not what I'm asking. – Liviu Stancu Mar 02 '23 at 19:47
  • @DamirTenishev Please note that according to the answer to this https://stackoverflow.com/questions/74601373/is-owning-the-lock-required-to-request-a-stop-while-waiting-on-a-condition-varia, there is no need to acquire the mutex when signaling the condition_variable_any by calling request_stop, if that's what your suggesting with your answer. – Liviu Stancu Mar 02 '23 at 20:14
  • @LiviuStancu I don't get it. Why do you think that your variable `m_cv` will be somehow notified by `request_stop()`? – sklott Mar 02 '23 at 20:47
  • @sklott According to this https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0660r10.pdf page 20,: _The following wait functions will be notified when there is a stop request on the passed stop_token. In that case the functions return immediately, returning false if the predicate evaluates to false_ – Liviu Stancu Mar 02 '23 at 20:50
  • 1
    @LiviuStancu Ah. Ok, I missed that you use `m_token` in arguments. Sorry... – sklott Mar 02 '23 at 20:55