0

This code is simplification of real project code. Main thread create worker thread and wait with std::condition_variable for worker thread really started. In code below std::condition_variable wakes up after current_thread_state becomes "ThreadState::Stopping" - this is the second notification from worker thread, that is the main thread did not wake up after the first notification, when current_thread_state becomes "ThreadState::Starting". The result was deadlock. Why this happens? Why std::condition_variable not wake up after first thread_event.notify_all()?

int main()
{
  std::thread thread_var;
  struct ThreadState {
    enum Type { Stopped, Started, Stopping };
  };
  ThreadState::Type current_thread_state = ThreadState::Stopped;
  std::mutex thread_mutex;
  std::condition_variable thread_event;
  while (true) {
    {
      std::unique_lock<std::mutex> lck(thread_mutex);
      thread_var = std::move(std::thread([&]() {
        {
          std::unique_lock<std::mutex> lck(thread_mutex);
          cout << "ThreadFunction() - step 1\n";
          current_thread_state = ThreadState::Started;
        }
        thread_event.notify_all();

        // This code need to disable output to console (simulate some work).
        cout.setstate(std::ios::failbit);
        cout << "ThreadFunction() - step 1 -> step 2\n";
        cout.clear();

        {
          std::unique_lock<std::mutex> lck(thread_mutex);
          cout << "ThreadFunction() - step 2\n";
          current_thread_state = ThreadState::Stopping;
        }
        thread_event.notify_all();
      }));

      while (current_thread_state != ThreadState::Started) {
        thread_event.wait(lck);
      }
    }

    if (thread_var.joinable()) {
      thread_var.join();
      current_thread_state = ThreadState::Stopped;
    }
  }
  return 0;
}
Arkanosis
  • 2,229
  • 1
  • 12
  • 18
Soler
  • 3
  • 2

2 Answers2

3

Once you call the notify_all method, your main thread and your worker thread (after doing its work) both try to get a lock on the thread_mutex mutex. If your work load is insignificant, like in your example, the worker thread is likely to get the lock before the main thread and sets the state back to ThreadState::Stopped before the main thread ever reads it. This results in a dead lock.

Try adding a significant work load, e.g.

std::this_thread::sleep_for( std::chrono::seconds( 1 ) );

to the worker thread. Dead locks are far less likely now. Of course, this is not a fix for your problem. This is just for illustrating the problem.

Markus Mayr
  • 4,038
  • 1
  • 20
  • 42
  • Yes. You right. If comment **cout.setstate(std::ios::failbit)** in work thread func, then all ok. But i don't fully understand, why **thread_event.wait(lck)** not wake up after first "thread_event.notify_all()", because c++ documentation says: _The wait operations atomically release the mutex and suspend the execution of the thread. When the condition variable is notified, the thread is awakened, and the mutex is reacquired._ – Soler Aug 03 '15 at 18:03
  • So the easiest solution in this case is changing `while (current_thread_state != ThreadState::Started)` to `while (current_thread_state == ThreadState::Stopped)` – stefaanv Aug 03 '15 at 20:57
  • @stefaanv, we must find the state *ThreadState::Started*, guaranteed to know that the working thread is running. Code in the working thread between the first and second call *thread_event.notify_all()* can execute arbitrary time or wait for some event. – Soler Aug 04 '15 at 05:00
  • @soler: okay, I was posing a solution to the posted problem, not to your requirement. Anyway, the problem is that you're not guaranteed that you're able to check the exact state, so somehow you'll have to deal with that.. – stefaanv Aug 04 '15 at 05:50
  • @Soler: Regarding your question in your first comment: Yes, the mutex is acquired, basically as if you would call the `lock` method on it. But there is a race condition between the two threads that try to acquire a lock on the mutex. It is undefined which of the threads gets the lock. Regarding your problem, you can use a second `condition_variable` or use polling. But there might be a better solution when analysing your actual requirements from a higher point of view. – Markus Mayr Aug 04 '15 at 06:01
  • @MarkusMayr, thank you. Your answer and comments contain sufficient information to fully understand the problem. – Soler Aug 10 '15 at 18:25
1

You have two threads racing: one writes values of current_thread_state twice, another reads the value of current_thread_state once.

It is indeterminate whether the sequence of events is write-write-read or write-read-write as you expect, both are valid executions of your application.

Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271