Design:

  • A singleton that contains a 'recursive' mutex resource.
  • 2 threads use this singleton to update/manage data.
  • The singleton is created by whichever thread tries to access it first.
  • Singleton creation has a global lock to ensure we call mutex attr init and mutex init only once.

Sample code: Both threads have an identical flow (just different data) and will call funcX() first.

instance() takes a global mutex lock to ensure that only one instance of A gets created. It also has an additional (!_instance) check right after the lock to make sure we do not create the instance again.

class A
{
public:
    void funcA();
    void funcB();
private:
    <members>
    boost::recursive_mutex m;  // recursive mutex guarding the members
};

void A::funcA()
{
    m.lock();
    <Do something>
    m.unlock();
    return;
}

void A::funcB()
{
    m.lock();
    <Do something>
    m.unlock();
    return;
}


void funcX()
{
    Singleton::instance().funcA();
    return;
}

void funcY()
{
    Singleton::instance().funcB();
    return;
}


========================================================================

A& Singleton::instance()
{
   <Global mutex lock>
   if (!_instance)   // _instance is a pointer, checked only while the global lock is held
   {
     createInstance();
   }
   <Global mutex unlock>
   return *_instance;
}
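
For comparison, here is a minimal sketch of the same design using a function-local static, which C++11 and later guarantee is initialized exactly once even when several threads reach it concurrently. It is not the code from my application: `std::recursive_mutex` stands in for `boost::recursive_mutex`, and the `calls` counter and `check_single_instance()` helper are hypothetical names added only for illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for class A from the question.
class A {
public:
    // Takes the recursive mutex, then calls funcB(), which takes it again.
    int funcA() {
        std::lock_guard<std::recursive_mutex> guard(m);
        ++calls;
        return funcB();  // re-entrant lock on the same thread is allowed
    }
    // Returns the number of funcA() calls made so far.
    int funcB() {
        std::lock_guard<std::recursive_mutex> guard(m);
        return calls;
    }
private:
    int calls = 0;            // assumption: placeholder for the real members
    std::recursive_mutex m;   // constructed together with the object
};

class Singleton {
public:
    static A& instance() {
        // Since C++11, concurrent first calls block until this
        // initialization has finished, so no explicit global lock is needed.
        static A inst;
        return inst;
    }
};

// Every call must hand back the same object.
void check_single_instance() {
    assert(&Singleton::instance() == &Singleton::instance());
}
```

With this shape the mutex is built by A's constructor before any caller can obtain a reference to the object, and the `lock_guard` objects release the mutex on every exit path.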

Problem:

Very rarely, the first mutex lock call does not increment the __count variable (it stays at 0), although the __owner (thread id), __nusers (1), and __lock (2) attributes are all updated. Whenever I add logging of the __kind attribute, the issue does not happen.

Initial findings:

When the issue happens, both threads are trying to initialize the singleton (and with it the mutex). Because of the global lock within singleton creation, only one thread proceeds, creates the mutex, and initializes it to the recursive type. The thread that then locks the mutex appears to be looking at outdated memory, which leads it to treat the mutex type as normal (__kind = 0). The mutex lock returns success without touching __count. By the time the subsequent unlock is called, the mutex type is seen as recursive, and because pthread unlock does not check for 0 before decrementing, it ends up decrementing __count from 0 to INT_MAX.

  else if (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex)
                             == PTHREAD_MUTEX_RECURSIVE_NP, 1))
    {
      /* Recursive mutex.  */
      if (mutex->__data.__owner != THREAD_GETMEM (THREAD_SELF, tid))
        return EPERM;

      if (--mutex->__data.__count != 0)
        /* We still hold the mutex.  */
        return 0;
      goto normal;
    }

Unlock also returns success, but the mutex is never released, so the other thread stays in the wait state forever.
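
To make the suspected sequence concrete, here is a simplified single-threaded model of the bookkeeping. It is not glibc code: `ModelMutex`, `model_lock`, `model_unlock`, and `mismatched_sequence` are hypothetical names, the fields only mirror the attributes mentioned above, and the counter is assumed to be unsigned, so in this model it wraps to UINT_MAX (the value seen in a real process depends on the field's actual type and how it is printed).

```cpp
#include <cassert>
#include <climits>

// Mutex kinds as seen by the caller; 0 matches the "normal" __kind value.
enum Kind { KIND_NORMAL = 0, KIND_RECURSIVE = 1 };

// Hypothetical model of the relevant mutex fields.
struct ModelMutex {
    int lock = 0;         // 0 = free, 1 = held
    unsigned count = 0;   // recursion depth, used by the recursive kind only
    int owner = 0;        // thread id of the holder
};

// Lock path: `seen` is the kind this thread believes the mutex has.
void model_lock(ModelMutex& m, Kind seen, int tid) {
    m.lock = 1;
    m.owner = tid;
    if (seen == KIND_RECURSIVE)
        ++m.count;        // the normal path never touches count
}

// Unlock path: returns true only if the lock was actually released.
bool model_unlock(ModelMutex& m, Kind seen, int tid) {
    assert(m.lock == 1);  // precondition of the model: the mutex is held
    if (seen == KIND_RECURSIVE) {
        if (m.owner != tid)
            return false; // corresponds to the EPERM branch
        if (--m.count != 0)
            return false; // "still held": caller sees success, lock stays taken
    }
    m.owner = 0;
    m.lock = 0;
    return true;
}

// Lock with the kind seen as normal, unlock with the kind seen as recursive.
ModelMutex mismatched_sequence() {
    ModelMutex m;
    model_lock(m, KIND_NORMAL, 42);
    model_unlock(m, KIND_RECURSIVE, 42);
    return m;
}
```

In the mismatched sequence the counter goes from 0 to its maximum value, the unlock takes the "still held" branch, and `lock` stays at 1, which matches the hang I see in the second thread.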

What are the possible reasons for this scenario to happen? Can the __kind be corrupted somehow?

Harry
  • Can you please update the question with your code that accesses the singleton? In particular the code that decides (presumably without a lock held) that the singleton needs to be created is fraught with peril. – caf May 28 '20 at 06:25
  • I have added some code snippets. Hope that answers your question. – Harry May 28 '20 at 07:33
  • That does help a lot. I can't see any issues there: as long as your real code for `Singleton::instance()` doesn't also check `if (_instance)` outside of the global lock? – caf May 29 '20 at 01:40
  • @caf Yes, I can confirm there is no `if (_instance)` check outside the lock. – Harry May 29 '20 at 03:06
  • If you don't get an answer from this, the next step is probably to boil it down to a minimal compilable example that still demonstrates the issue. – caf May 29 '20 at 03:17
  • @caf Yes, I have considered that and have a sample application. I have yet to reproduce the issue with it, though. I shall let you know if I manage to see it with a standalone application. Thank you for the responses. – Harry Jun 02 '20 at 04:58
