49

I'm encountering the following error at unpredictable times in a Linux-based (ARM) communications application:

pthread_mutex_lock.c:82: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

Google turns up a lot of references to that error, but little information that seems relevant to my situation. I was wondering if anyone can give me some ideas about how to troubleshoot this error. Does anyone know of a common cause for this assertion?

Thanks in advance.

Dave Causey
  • 3
    Having eliminated all other possibilities, I decided to invest in some RTFM. It appears I have been using the mutex in a way that is not officially supported. When a thread is waiting for some external stimulus, it waits on its mutex. The thread comes back to life when the mutex is released, always from _another_ thread. So the releasing thread is _never_ the mutex owner. I changed the implementation to use a condition variable. I don't know yet if this is the reason for my troubles. I've been (mis)using the mutex this way for years and haven't had any problems with it until now. – Dave Causey Jul 10 '09 at 02:42
  • 4
    Aren't `pthread_mutex`es (and mutexes in general) documented such that they must be unlocked by the same thread that locked them? The fact that it happens to work on other platforms is implementation-specific and not portable. – ephemient Jul 10 '09 at 14:53
  • I think that's what I said in my comment above. My implementation was misusing the mutex, so I changed it to make correct usage of a condition variable. All that remains is to confirm that this was in fact behind the intermittent assertion. – Dave Causey Jul 10 '09 at 21:36
  • I have the same error sometimes when my mutex is not initialized correctly --> use pthread_mutex_init – Chris Maes May 27 '15 at 07:41

8 Answers

37

Rock solid for 4 days straight. I'm declaring victory on this one. The answer is "stupid user error" (see comments above). A mutex should only be unlocked by the thread that locked it. Thanks for bearing with me.
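For anyone wanting to see the shape of the fix, here is a minimal sketch of waking a thread with a condition variable instead of having another thread unlock its mutex. It is written in C++11 rather than raw pthreads, and the `Signal` class and `run_signal_demo` function are hypothetical names made up for illustration, not the code from the original application.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a reusable "wake up" signal built on a condition
// variable, so that no thread ever unlocks a mutex it did not lock.
class Signal {
public:
    // Called by the waiting thread; blocks until notify() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return ready_; });
        ready_ = false;  // consume the signal so the next wait() blocks again
    }

    // Called by any other thread to wake the waiter.
    void notify() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ready_ = true;
        }
        cond_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool ready_ = false;
};

// Small demonstration: returns true once the worker has been woken.
bool run_signal_demo() {
    Signal signal;
    bool woke = false;
    std::thread worker([&] {
        signal.wait();   // sleeps until the main thread calls notify()
        woke = true;
    });
    signal.notify();
    worker.join();       // join() makes the write to `woke` visible here
    assert(woke);
    return woke;
}
```

Every lock and unlock of `mutex_` happens inside a single thread's scope; the cross-thread wake-up is carried by the `ready_` flag and `notify_one()`, which is the usage condition variables are meant for.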

Dave Causey
  • 5
    Your solution only applies to unlocking then, right? I'm getting the same error when trying to lock it. – User Nov 27 '17 at 04:32
12

TLDR: Make sure you are not locking a mutex that has been destroyed / hasn't been initialized.

Although the OP has his answer, I thought I would share my issue in case anyone else has the same problem I did.

Notice that the assertion is in __pthread_mutex_lock and not in the unlock. This, to me, suggests that most other people having this issue are not unlocking a mutex in a different thread than the one that locked it; they are just locking a mutex that has been destroyed.

In my case, I had a class (let's call it Foo) that registered a static callback function with another class (let's call it Bar). The callback was passed a reference to Foo and would occasionally lock/unlock a mutex that was a member of Foo.

This problem occurred after the Foo instance was destroyed while the Bar instance was still using the callback. The callback was being passed a reference to an object that no longer existed and, therefore, was calling __pthread_mutex_lock on garbage memory.

Note that I was using C++11's std::mutex and std::lock_guard<std::mutex>, but since std::mutex is built on top of pthread mutexes on Linux, the problem was exactly the same.
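One possible way to guard against the callback outliving the object is to hand the callback a `std::weak_ptr` instead of a raw reference. This is only a sketch under that assumption: the `Foo` below is a stand-in for the class described above, and `make_callback` and `callback_lifetime_demo` are hypothetical names, not part of my original code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>

// Stand-in for the "Foo" described above: owns a mutex that the callback locks.
class Foo {
public:
    void on_event() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++events_;
    }
    int events() {
        std::lock_guard<std::mutex> lock(mutex_);
        return events_;
    }

private:
    std::mutex mutex_;
    int events_ = 0;
};

// Builds a callback that holds only a weak reference to Foo. If Foo has been
// destroyed, weak.lock() yields an empty pointer and the callback does nothing.
std::function<bool()> make_callback(const std::shared_ptr<Foo>& foo) {
    std::weak_ptr<Foo> weak = foo;
    return [weak]() -> bool {
        if (std::shared_ptr<Foo> alive = weak.lock()) {
            alive->on_event();
            return true;   // Foo still existed; its mutex was safe to lock
        }
        return false;      // Foo is gone; skip instead of locking freed memory
    };
}

// Demonstration: the callback works while Foo is alive and becomes a no-op
// after Foo is destroyed.
void callback_lifetime_demo() {
    auto foo = std::make_shared<Foo>();
    std::function<bool()> callback = make_callback(foo);

    bool ran_while_alive = callback();
    assert(ran_while_alive);
    assert(foo->events() == 1);

    foo.reset();  // destroys Foo (and its mutex)
    bool ran_after_destroy = callback();
    assert(!ran_after_destroy);
}
```

Holding the temporary `shared_ptr` for the duration of the call also keeps Foo alive until the callback has released the mutex, so the object cannot be destroyed halfway through.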

rationalcoder
  • 1
    To add to this, I had this happen when I was unlocking the same lock twice. The assertion error happened the next time I was attempting to acquire the lock, which made it a bit harder to find. – HashFail May 06 '19 at 16:58
4

I was faced with the same problem and Google sent me here. The problem with my program was that in some situations I was not initializing the mutex before locking it.
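While hunting for this kind of bug, it may help to initialize the mutex explicitly as an error-checking mutex, so that misuse is reported through return codes instead of surfacing later as an assertion. The following is a sketch, not my original code: `init_errorcheck_mutex`, `ErrorcheckResults` and `errorcheck_demo` are hypothetical names, while the pthread calls and `PTHREAD_MUTEX_ERRORCHECK` are standard POSIX.

```cpp
#include <cerrno>
#include <pthread.h>

// Initializes `mutex` as an error-checking mutex. Returns 0 on success or
// the error number reported by the pthread call that failed.
int init_errorcheck_mutex(pthread_mutex_t* mutex) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Return codes collected by the demonstration below.
struct ErrorcheckResults {
    int init;             // result of initializing the mutex
    int unlock_unlocked;  // result of unlocking a mutex nobody holds
    int relock;           // result of locking a mutex this thread already holds
};

ErrorcheckResults errorcheck_demo() {
    ErrorcheckResults results = {0, 0, 0};
    pthread_mutex_t mutex;

    results.init = init_errorcheck_mutex(&mutex);
    if (results.init != 0) return results;

    // Unlocking a mutex this thread does not own is reported as EPERM.
    results.unlock_unlocked = pthread_mutex_unlock(&mutex);

    pthread_mutex_lock(&mutex);
    // Locking again from the owning thread is reported as EDEADLK
    // instead of hanging.
    results.relock = pthread_mutex_lock(&mutex);
    pthread_mutex_unlock(&mutex);

    pthread_mutex_destroy(&mutex);
    return results;
}
```

Checking the return value of every `pthread_mutex_lock` and `pthread_mutex_unlock` call during debugging should then point at the first misuse rather than at a later, unrelated lock attempt.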

Although the statement in the accepted answer is legitimate, I don't think it is the cause of this failed assertion, because the error is reported in pthread_mutex_lock (and not in unlock).

Also, as always, it is more likely that the error is in the programmer's source code than in the compiler.

Shayan Pooya
2

In case you are using C++ and std::unique_lock, check this answer: https://stackoverflow.com/a/9240466/9057530

yyFred
1

The quick bit of Googling I've done often blames this on a compiler mis-optimization. A decent summation is here. It might be worth looking at the assembly output to see if gcc is producing the right code.

Either that, or you are managing to stomp on the memory used by the pthread library... those sorts of problems are rather tricky to find.

Chris Arguin
  • I've been down the compiler mis-optimization path, which doesn't appear to be an issue in this case: assert (mutex->__data.__owner == 0); 154: e5953008 ldr r3, [r5, #8] 158: e3530000 cmp r3, #0 ; 0x0 15c: 1a0001a0 bne 7e4 <__pthread_mutex_lock+0x7e4> – Dave Causey Jul 09 '09 at 18:50
1

I was having the same problem.

In my case, I was connecting to a Vertica database over ODBC from inside the thread. Adding the following setting to /etc/odbcinst.ini solved my problem; I haven't seen the assertion since.

[ODBC]
Threading = 1

Credits to: hynek

ismail
1

I have just fought my way through this one and thought it might help others. In my case the issue occurred in a very simple method that locked the mutex, checked a shared variable and then returned. The method is an override of a base class method, and the base class creates a worker thread.

The problem in this instance was that the base class was creating the thread in its constructor. The thread then started executing and the derived class's implementation of the method was called. Unfortunately the derived class had not yet finished constructing, and the mutex in the derived class had uninitialised data in its owner field. This made it look like the mutex was locked when it wasn't.

The solution is really simple. Add a protected method to the base class called StartThread(). This needs to be called from the derived class's constructor, not from the base class.
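A rough sketch of that arrangement, using C++11 `std::thread` and `std::mutex` for brevity. Apart from `StartThread()`, every name here (`WorkerBase`, `Worker`, `JoinThread`, `Run`, `Wait`, `start_thread_demo`) is hypothetical and only illustrates the idea; it is not the original code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class WorkerBase {
public:
    virtual ~WorkerBase() = default;  // derived class must have joined by now

protected:
    WorkerBase() = default;  // deliberately does NOT start the thread

    // Called from the derived constructor, once the derived members exist.
    void StartThread() {
        thread_ = std::thread([this] { Run(); });
    }

    // Called by the derived class before its members are destroyed.
    void JoinThread() {
        if (thread_.joinable()) thread_.join();
    }

    virtual void Run() = 0;

private:
    std::thread thread_;
};

class Worker final : public WorkerBase {
public:
    Worker() { StartThread(); }            // mutex_ and value_ are constructed
    ~Worker() override { JoinThread(); }   // stop using mutex_ before it dies

    void Wait() { JoinThread(); }

    int value() {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

protected:
    void Run() override {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = 42;
    }

private:
    std::mutex mutex_;
    int value_ = 0;
};

// Demonstration: the worker thread only ever sees a fully constructed Worker.
int start_thread_demo() {
    Worker worker;
    worker.Wait();
    int result = worker.value();
    assert(result == 42);
    return result;
}
```

The same reasoning applies in reverse at destruction time: the thread has to be joined in the derived destructor, because by the time the base destructor runs the derived mutex is already gone.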

0

Adding Threading=0 to the /etc/odbcinst.ini file fixed this issue.