Program abort hangs the named mutex

Question

I have several processes, but only one should be running at a time. For example, if Process1 is running and Process2 is launched, Process2 should wait until Process1 has finished. I am considering the Boost named_mutex for this purpose. To avoid a scenario where the mutex is not released because an exception is thrown, boost::lock_guard looks useful. I came up with the following simplified version of the code.

#include <iostream>
#include <boost/interprocess/sync/named_mutex.hpp>
#include <boost/thread.hpp>
#include <chrono>
#include <thread>

using namespace boost::interprocess;
#pragma warning(disable: 4996)
int main()
{
    std::cout << "Before taking lock" << std::endl;

    named_mutex mutex(open_or_create, "some_name");
    boost::lock_guard<named_mutex> guard(mutex);

    // Some work that is simulated by sleep
    std::cout << "now wait for 10 second" << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(10));

    std::cout << "Hello World";
}

So far, so good. While this program is running, I press Ctrl+C so that it is aborted (to simulate a crash, an unhandled exception, etc.). When I run the application again afterwards, it hangs on the following lines of code.

named_mutex mutex(open_or_create, "some_name");
boost::lock_guard<named_mutex> guard(mutex);

If I change the mutex name, the program runs without hanging. It looks like the mutex named some_name is somehow "remembered" on the machine in some sort of bad state, so any application that tries to acquire a mutex named some_name hangs on these lines. If I change the name to, let's say, some_name2, the program works fine again.

Can someone please explain what is causing this behavior?
How can I reset the behavior for this particular mutex?
Most importantly, how to avoid this scenario in a real application?

It's saving state somewhere on the filesystem. See https://stackoverflow.com/q/20379817/398091 — ppetraki, Apr 24 '19 at 01:41

Michael Kenzel · Accepted Answer · 2019-04-24T03:56:23.540

As explained in this answer to the question linked by @ppetraki above, boost::interprocess::named_mutex unfortunately uses a file lock on Windows rather than an actual mutex. If your application terminates abnormally, that file lock will not be removed from the system. This is actually the subject of an open issue.

Looking at the source code, we see that, if BOOST_INTERPROCESS_USE_WINDOWS is defined, internal_mutex_type maps to a windows_named_mutex which, internally, uses a windows_named_sync, which seems to just be using a file lock in the end. I'm not sure what exactly the rationale for this implementation choice is. Whatever it may be, there does not seem to be any way to get boost::interprocess to use a proper named mutex on Windows. I would suggest simply creating a named mutex yourself using CreateMutex, for example:

#include <type_traits>
#include <memory>
#include <stdexcept>
#include <mutex>
#include <iostream>

#define NOMINMAX
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

struct CloseHandleDeleter { void operator ()(HANDLE h) const { CloseHandle(h); } };

class NamedMutex
{
    std::unique_ptr<std::remove_pointer_t<HANDLE>, CloseHandleDeleter> m;

public:
    NamedMutex(const wchar_t* name)
        : m(CreateMutexW(nullptr, FALSE, name))
    {
        if (!m)
            throw std::runtime_error("failed to create mutex");
    }

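    void lock()
    {
        // WAIT_ABANDONED (the previous owner exited without releasing) still
        // grants ownership, so only WAIT_FAILED is treated as an error here.
        if (WaitForSingleObject(m.get(), INFINITE) == WAIT_FAILED)
            throw std::runtime_error("something bad happened");
    }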

    void unlock()
    {
        ReleaseMutex(m.get());
    }
};

int main()
{
    try
    {
        NamedMutex mutex(L"blub");

        std::lock_guard lock(mutex);  // class template argument deduction, needs C++17

        std::cout << "Hello, World!" << std::endl;
    }
    catch (...)
    {
        std::cerr << "something went wrong\n";
        return -1;
    }

    return 0;
}

Very nice! Is there any significance to the "Global\\" in the name you used, "Global\\blub"? Or could this name literally be anything, like "blah-blah"? – whoami Apr 24 '19 at 03:40
@BKS The Global\\ prefix will add the mutex to the global [namespace](https://learn.microsoft.com/en-us/windows/desktop/TermServ/kernel-object-namespaces). Most likely, this will not actually be needed in your application… – Michael Kenzel Apr 24 '19 at 03:51
@BKS, in terms of implementation in NT systems (all since XP), Windows creates non-device named kernel objects in the object-manager directory "\Sessions\\BaseNamedObjects". There's a "Global" symlink in this directory to the global "\BaseNamedObjects" directory. This global named-object directory is also used for session 0, the services session. Since this is really an NT path, we have to use a backslash, i.e. `"Global\\blub"`. We can't use `"Global/blub"` since a forward slash is just a name character in NT paths, except for being reserved in filesystems. – Eryk Sun Apr 24 '19 at 19:48
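
As a small illustration of the escaping point in the comments above, here is a sketch of a helper that builds a global-namespace name. `global_object_name` is a hypothetical name, not part of the Windows API; the only thing it shows is that `"\\"` in source code becomes a single backslash in the string that is passed on.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: prefixes a kernel object name with the global namespace.
inline std::wstring global_object_name(const std::wstring& base)
{
    // Apart from the namespace prefix, the name itself must not contain a backslash.
    assert(base.find(L'\\') == std::wstring::npos);

    // "\\" is the escape sequence for one backslash character.
    return L"Global\\" + base;
}
```

With the `NamedMutex` class from the answer, this could be used as `NamedMutex mutex(global_object_name(L"blub").c_str());`. As noted in the comments, the prefix is most likely not needed unless the processes run in different sessions.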

David Schwartz · Answer 2 · 2019-04-24T01:51:39.423

Can someone please explain what is causing this behavior?

The mutex is global.

How can I reset the behavior for this particular mutex?

Call boost::interprocess::named_mutex::remove("mutex_name");

Most importantly, how to avoid this scenario in a real application?

It depends on what your outer problem is. Perhaps a more sensible solution is to use a file lock instead. A file lock will go away when a process is destroyed.

Updates:

I understand mutex is global but what happens with that mutex that causes the program to hang?

The first program acquired the mutex and never released it so the mutex is still held. Mutexes are typically held while shared state is put into an inconsistent state, so automatically releasing the mutex would be disastrous.

How can I determine if that mutex_name is in a bad state so it's time to call the remove on it?

In your case you really can't because you picked the wrong tool for the job. The same logic you would use to tell if the mutex was in a sane state would just solve your whole problem, so the mutex just made things harder. Instead, use a file lock. It may be useful to write the process name and process ID into the file to help in troubleshooting.

edited Apr 24 '19 at 01:51

answered Apr 24 '19 at 01:46

I understand mutex is global but what happens with that mutex that causes the program to hang? – whoami Apr 24 '19 at 01:49
How can I determine if that mutex_name is in a bad state so it's time to call the remove on it? – whoami Apr 24 '19 at 01:49
I just want to avoid the scenario where many programs from the suite of applications are running at the same time. If one of the programs is running, I want other programs to wait until the first one finishes. – whoami Apr 24 '19 at 01:52
Can you provide any links about the file lock approach? – whoami Apr 24 '19 at 01:54
Look [here](https://www.boost.org/doc/libs/1_70_0/doc/html/boost/interprocess/file_lock.html). – David Schwartz Apr 24 '19 at 01:54
Regarding "The first program acquired the mutex and never released it", my impression was that lock_guard protects against those scenarios. – whoami Apr 24 '19 at 01:54
@BKS Your impression is incorrect. A `lock_guard` releases a lock when its destructor runs. If the destructor doesn't run, the lock doesn't get released. If you, for example, allocate a `lock_guard` with `new` and don't `delete` it, the lock won't be released. In your case, the destructor does not get invoked because the process terminates without destroying the `lock_guard`. – David Schwartz Apr 24 '19 at 01:55
Thanks for the useful information and links. I did some reading on file locks and have a couple of questions. How does using one increase reliability compared to the boost::named_mutex I am currently using? What will happen with the file lock mechanism if the application crashes as I described in the question? – whoami Apr 24 '19 at 02:42
I don't see how using a file lock would help here. Quite to the contrary, it would seem that the fact that a file lock is being used (see the question linked by @ppetraki in his comment above) is the very cause of the problem to begin with!? – Michael Kenzel Apr 24 '19 at 03:00
@BKS When a process crashes, any file locks it holds are automatically released. – David Schwartz Apr 24 '19 at 14:00
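
To make the `lock_guard` point from the comments concrete, here is a minimal sketch in standard C++. `FlagMutex` is a hypothetical toy type that only records whether it is held; it stands in for the named mutex. The heap-allocated guard imitates a process that is killed before any destructor runs.

```cpp
#include <cassert>
#include <mutex>

// Toy lockable type (hypothetical) that only records whether it is held.
struct FlagMutex
{
    bool held = false;
    void lock()   { held = true; }
    void unlock() { held = false; }
};

// Normal case: the guard is a local object, so its destructor runs at scope exit.
inline bool held_after_scope_exit()
{
    FlagMutex m;
    {
        std::lock_guard<FlagMutex> guard(m);
        assert(m.held);              // locked while the guard is alive
    }                                // destructor runs here and unlocks
    return m.held;
}

// The guard lives on the heap and nothing has destroyed it yet, which mirrors
// a process that terminates before its destructors get a chance to run.
inline bool held_while_guard_undestroyed()
{
    FlagMutex m;
    auto* guard = new std::lock_guard<FlagMutex>(m);
    const bool held = m.held;        // still locked: no destructor has run
    delete guard;                    // clean up so the demo itself does not leak
    assert(!m.held);                 // only now is the mutex released
    return held;
}
```

`held_after_scope_exit()` returns false and `held_while_guard_undestroyed()` returns true. The guard only helps on paths where its destructor actually runs, such as normal returns and exceptions that are caught.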
