UPDATE 21.02.2020: Holding the lock while notifying doesn't actually help. As far as I understand so far, the waiting process leaves the condition variable in an invalid state in the shared memory.

I have an application that uses Boost.Interprocess to share memory, and access to it is synchronized with an interprocess condition variable. I am using Boost 1.62 on Windows and compiling with Microsoft Windows Build Tools 2015.

What happens is that when I terminate the waiting process with a Ctrl-C, the notifying process gets stuck in the notify call.

Here's a demo program that allows reproducing the issue. You have to run the executable once without any argument to start the waiting process and once more with some argument to start the notifying process. Then kill the first process. Sometimes you will observe that the printing stops at "Entering notify".

#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/interprocess_condition.hpp>

#include <iostream>

struct shared_data
{
   boost::interprocess::interprocess_mutex mutex;
   boost::interprocess::interprocess_condition condition;

   bool test_bool = false;
};


int main(int argc, char *argv[])
{
    using namespace boost::interprocess;

    if (argc == 1) {
        struct shm_remove
        {
            shm_remove() {
                shared_memory_object::remove("MySharedMemory");
            }
            ~shm_remove() {
                shared_memory_object::remove("MySharedMemory");
            }
        } remover;

        shared_memory_object shm(create_only, "MySharedMemory", read_write);

        shm.truncate(sizeof(shared_data));
        mapped_region region(shm, read_write);
        void* addr = region.get_address();
        shared_data* data = new (addr) shared_data;

        while (true) {
            scoped_lock<interprocess_mutex> lock(data->mutex);
            while (!data->test_bool) {
                data->condition.wait(lock);
            }
            std::cout << "test_bool became true" << std::endl;
            data->test_bool = false;
        }
    }
    else {
        shared_memory_object shm(open_only, "MySharedMemory", read_write);
        mapped_region region(shm, read_write);
        shared_data* data = static_cast<shared_data*>(region.get_address());
        while (true) {
            {
                scoped_lock<interprocess_mutex> lock(data->mutex);
                data->test_bool = true;
            }
            std::cout << "Entering notify" << std::endl;
            data->condition.notify_one();
            std::cout << "Exiting notify" << std::endl;
        }
    }
}

(Of course, killing the process while it is waiting is harsh, but as far as I've debugged it, the wait call is cleaned up after the signal.)

If I keep the lock acquired while calling notify_one, the issue does not manifest. However, I did not expect there to be a need to hold the lock while notifying, in the spirit of the C++ threading implementation. I haven't found any specification on this point in the documentation, only the example, which does indeed keep the lock acquired.

Now, given that I have a solution to my problem, my questions are:

  1. Is the need to have the lock acquired while notifying the expected and only correct usage, or is it a bug?
  2. If it is the expected usage, why?
ianos

1 Answer

You don't have to hold the lock when calling notify, but in most cases you should still do so, because otherwise some threads (or, in your case, processes) could miss the notification. Consider the following scenario:

  • process 1 acquires the lock and checks the condition, but is preempted before calling condition.wait
  • process 2 calls condition.notify_one - but there is no process to be notified
  • you kill process 2
  • now process 1 finally calls condition.wait - and waits forever.

By acquiring the lock before calling notify, you can ensure that the other process has already called wait and therefore cannot miss the notification. This also holds for std::condition_variable, not only your interprocess example.

There are a few situations where this might not be an issue (e.g., because you do not wait forever, but only for a limited time), but you should be very careful there.

mpoeter
  • I can see that there are cases where you have to hold the lock while notifying. But there are situations where you don't have to, and I think my example is one of them. Nevertheless, I find it wrong behavior that the notify call blocks when called without a lock (in the scenario above). So the question remains, is this a bug or is this some limitation of the library? – ianos Jan 31 '20 at 13:37