
As the post title says, boost::interprocess::interprocess_condition::wait is supposed to atomically unlock the mutex while it waits, but it doesn't seem to.

In the following code:

boost::interprocess::scoped_lock< boost::interprocess::interprocess_mutex > state_access_lock(impl->state->state_access_mut);
impl->state->state_access_cond.wait(state_access_lock);

While debugging in VS2010, I pressed pause and was surprised to see that state_access_lock was still locked while waiting.

But that's not what Boost's documentation says here.

Does anybody have a suggestion?

Thanks.

Sebastian Redl
Me as is
  • Do you have an actual observation of the behavior of code or just the value of some variable in the debugger? – Sebastian Redl Jun 13 '13 at 15:08
  • I noticed the behavior first. My second thread, which is supposed to write and then call notify, was waiting forever for the mutex to be released. Only after that did I decide to check what was going on with my mutex in the first thread using VS debug mode. – Me as is Jun 13 '13 at 15:30
  • Aside from it being very unlikely that such a fundamental brokenness would go unnoticed, every single codepath of the various condition variables (by the way, you should specify which one you're using) goes through an unlock of the mutex lock. So the error must be somewhere else. Try posting more context. – Sebastian Redl Jun 13 '13 at 15:59
  • I traced the execution inside the wait method of boost::interprocess::interprocess_condition and saw that the mutex itself is unlocked by calling `mut.unlock()`, but the scoped_lock itself still has is_locked set to true. So I'm confused: if I have a scoped_lock like this: `scoped_lock lk(mut)`, what is the difference between calling `mut.unlock()` and `lk.unlock()`? – Me as is Jun 13 '13 at 17:12

2 Answers


Based on the comments so far, I think I can infer an answer.

Don't trust the members of the scoped_lock you pass to interprocess_condition::wait(). The contract of interprocess_condition (unlike interprocess_condition_any) states that you can only use it with a lock for an interprocess_mutex. Knowing this, the condition variable pulls the internal mutex out of your lock to do its job more efficiently than if it knew nothing about the lock.

So when it comes to unlocking the mutex, it doesn't call unlock() on your scoped_lock, but directly on the mutex. This is fine for the internal implementation; don't do this at home. Bad things happen if you don't re-lock the mutex before the lock goes out of scope.

In other words, the behavior you see in the debugger is not indicative of a problem. If you have a deadlock, it must be somewhere else.

Edit

The condition variable stuff in the actual code given looks fine to me. I find the interactions with start_mut a bit strange. Are you sure that part is not problematic?

Sebastian Redl
  • Ok, so instead of this: `boost::interprocess::scoped_lock< boost::interprocess::interprocess_mutex > state_access_lock(impl->state->state_access_mut); impl->state->state_access_cond.wait(state_access_lock);` I should write: `state->state_access_mut.lock(); state->state_access_cond.do_wait(state->state_access_mut);` and then manipulate the mutex itself instead of putting it into a `scoped_lock` — am I right? – Me as is Jun 14 '13 at 13:15
  • No, the code as written looks fine actually. It's just that the condition variable unlocks the underlying mutex directly, not via the lock object, so you see confusing values in the debugger. But that should be the *only* effect. If you're deadlocking, you have a different problem. Why don't you post the code for both threads? – Sebastian Redl Jun 14 '13 at 13:56
  • I posted the code using "Add Another Answer", but it appears above your answer from today. – Me as is Jun 14 '13 at 15:24
  • I use the start_mut to ensure that the first thread calls `wait()` before the second one calls `notify_all()`. I did it this way because both of my threads are started from the main thread, and I have no way of knowing whether the first one called `wait()` before the second one started. – Me as is Jun 14 '13 at 19:07

OK, this is the first thread:

void CSharedMemory::start(start_mode mode)
{
    bool start_mut_locked = true;
    impl->running = true;
    impl->mode = mode;

    stateMetaStruct* state = impl->proc_state;

    boost::interprocess::scoped_lock< boost::interprocess::interprocess_mutex > state_access_lock(state->state_access_mut);

    while(impl->running)
    {
        state->data_written = false;
        while(!state->data_written)
        {
            if(start_mut_locked)
            {
                // We can now unlock and let other threads send data.
                impl->start_mut.unlock();
                start_mut_locked = false;
            }

            state->state_access_cond.wait(state_access_lock); // wait here upon shared memory's state change
            boost::interprocess::offset_ptr< stateMetaStruct > s = impl->shm_obj.find< stateMetaStruct >(boost::interprocess::unique_instance).first;
            state = s.get();

            if(!state->data_written)
            {
                // Spurious wakeup.
                glm_debug("Spurious wakeup.");
            }

            if(this == state->data_written_by_proccess)
            {
                state->data_written = false;
                glm_debug("Ignoring my own event.");
            }
        }

        if(impl->running)
        {
            // Got an action from another process.
            const interprocess_actions state_action = state->action;

            if(DO_STOP == state_action) {
            }
            else if(DUMP_USERS_REQUEST == state_action) {
                impl->stateChangedListener->onDumpUsersRequest();
            }
            else if(DUMP_USERS_REPLY == state_action) {
            }
            else {
                glm_err("Unexpected state.");
            }
        }
    }
}

The second thread tries to send data using this method:

void CSharedMemory::sendDumpUsersRequest()
{
    // Ensure shm is started.
    boost::mutex::scoped_lock lk(impl->start_mut);

    glm_debug("%s", __FUNCTION__);

    boost::interprocess::offset_ptr< stateMetaStruct > s = impl->shm_obj.find< stateMetaStruct >(boost::interprocess::unique_instance).first;
    stateMetaStruct* state = s.get();

    boost::interprocess::scoped_lock< boost::interprocess::interprocess_mutex > state_access_lock(state->state_access_mut);

    state->action = DUMP_USERS_REQUEST;

    state->data_written = true;
    state->data_written_by_proccess = this;

    // Send request.
    state->state_access_cond.notify_all();
}

The behavior is that the second thread blocks when trying to acquire the scoped_lock, because the first one is waiting on it.

Me as is