2

I have an application that loads a DLL with a class that handles jobs in a queue. In order to keep it thread safe, a mutex is locked whenever the queue is modified. When the application exits and the destructor is called, the mutex is locked in order to clear the queue.

However, when I load this DLL in Python, create an instance of the object, and call exit() (in Python) an exception is thrown when the mutex tries to lock:

Microsoft Visual Studio C Runtime Library has detected a fatal error in python.exe.

I have simplified the destructor down to just creating a mutex locally and trying to lock it, and can still reproduce the issue:

QueueHandler::~QueueHandler(void)
{
    mutex mut; // in reality, this is a member of the class and there are actual operations between lock and unlock
    mut.lock(); // exception here
    mut.unlock();
}

If I take my unmodified code and simply remove the lock around the queue operation, it works fine.

Here is the seemingly relevant section of the call stack:

KernelBase.dll!RaiseException() Unknown
msvcr120.dll!_CxxThrowException(void * pExceptionObject, const _s__ThrowInfo * pThrowInfo) Line 154 C++
msvcr120.dll!Concurrency::details::SchedulerBase::SchedulerBase(const Concurrency::SchedulerPolicy & policy) Line 149   C++
msvcr120.dll!Concurrency::details:: SchedulerBase::CreateWithoutInitializing(const Concurrency::SchedulerPolicy & policy) Line 285  C++
msvcr120.dll!Concurrency::details:: SchedulerBase::CreateContextFromDefaultScheduler() Line 571 C++
msvcr120.dll!Concurrency::details::SchedulerBase::CurrentContext() Line 404 C++
[Inline Frame] msvcr120.dll!Concurrency::details::LockQueueNode::{ctor (unsigned int) Line 619  C++
msvcr120.dll!Concurrency::critical_section::lock() Line 1031    C++
msvcp120.dll!mtx_do_lock(_Mtx_internal_imp_t * * mtx, const xtime * target) Line 67 C++
--> MyApplication.dll!QueueHandler::~QueueHandler() Line 106    C++
MyApplication.dll!_CRT_INIT(void * hDllHandle, unsigned long dwReason, void * lpreserved) Line 416  C
MyApplication.dll!__DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved) Line 522    C
ntdll.dll!LdrShutdownProcess()  Unknown
ntdll.dll!RtlExitUserProcess()  Unknown
msvcr100.dll!doexit(int code, int quick, int retcaller) Line 621    C
python27.dll!000000001e13be65() Unknown
...
python27.dll!000000001e043494() Unknown
python.exe!000000001d00119e()   Unknown

Questions:

  1. If this code works when I exit my app normally (close the GUI), why would it be different when I exit() from Pyton?
  2. Is there a "more correct" way to exit from Python?
  3. Could this be related to the type of mutex/lock being used?
  4. Do I even need to lock the section with the queue operations in my destructor? Or is it okay to delete the objects in the queue without a lock?

Edit: MCVE: QueueHandlerApp - Run the app or run script.py to demonstrate the issue.

SFBA26
  • 870
  • 3
  • 12
  • 24
  • 2
    Having a lock in a destructor is a definite no-no. –  Jun 14 '17 at 21:50
  • @CaptainObvlious, so you're saying the mutex object is being destroyed as it's being locked? If I shouldn't use a lock in the destructor, then should I clean up the queue without a lock? Or is there a different approach to cleaning up the queue? – SFBA26 Jun 14 '17 at 21:55
  • You are creating a completely new lock in the destructor, then you lock it and unlock it. What on earth is this supposed to achieve (that is useful)? – Jesper Juhl Jun 14 '17 at 22:02
  • @JesperJuhl Nothing, I was just trying simplify the code as much as possible to focus on where the problem is. In reality, the mutex is a member of the class and there are actual operations between the lock and unlock. – SFBA26 Jun 14 '17 at 22:04
  • 1
    @SFBA26 then perhaps you should post a [mcve] so that we have an actual chance of helping you with your problem. Just posting code snippets that don't do anything and that we can't compile/run/test is fairly useless. – Jesper Juhl Jun 14 '17 at 22:06
  • "why would it be different when I exit()" - `exit()` doesn't call destructors of local objects (amongst other things). It's a rather brutal way of terminating a program compared to returning from `main`. – Jesper Juhl Jun 14 '17 at 22:11
  • Why are you locking anything in the destructor? – Galik Jun 14 '17 at 22:14
  • Doing kernel calls during dll shutdown is not supported behaviour. There was a series of post on "The Old New Thing" blog regarding this very issue. Will try and find them ... Here's one. https://blogs.msdn.microsoft.com/oldnewthing/20100122-00/?p=15193 – Richard Critten Jun 14 '17 at 22:14
  • Are you locking the queue from within the object you are deleting from the queue? – Galik Jun 14 '17 at 22:15
  • @Galik, no, the the queue handler locks the queue, which contains "job" objects. Part of the question was whether I need to lock the queue in the destructor, and it appears the answer is no. – SFBA26 Jun 14 '17 at 22:23
  • @RichardCritten thanks for the info! – SFBA26 Jun 14 '17 at 22:28
  • @JesperJuhl, you're right. I added a MVCE. Sorry I didn't add it originally. – SFBA26 Jun 15 '17 at 19:15
  • 1
    @NeilButterworth Nothing wrong with locking in destructors. This is something `operator delete` already does. Any third party allocator needs this too. The issue here is with broken `std::mutex` implementation which gets worse due to broken module deinitialization on Windows. –  Feb 11 '18 at 18:27

1 Answers1

2

Some people, when confronted with a problem, think, "I know, I'll use lazy initialization." Now they have two problems.

This is a bug in MSVC implementation of std::mutex. Before MSVC14 std::mutex and std::condition_variable would perform some internal initializations lazily. This alone is bad, but becomes even worse due to how modules are deinitialized on Windows.

The bug was fixed in MSVC14 (Visual Studio 2015) - std::mutex was rewritten to use SRWLock internally. SRWLock is a simple primitive without additional dependencies. It relies only on atomic instructions and Keyed Events syscalls. As the kernel is isolated from the userspace, SRWLock's should work seemlessly regardless of where they are used.

Seems like you are using MSVC12 (Visual Studio 2013). You should switch to MSVC14 (Visual Studio 2015) or use Boost.Thread instead.

There are actually many issues with std::mutex on MSVC12 and earlier. Some are related to the actual implementation used in CRT, others (as I heard) are caused by bugs in Windows 7 and were fixed in Windows 8.