I am working with a custom embedded real-time OS that has a home-brew threading and synchronization facility. Mutual exclusion is implemented in a way similar to the one outlined below:

#include <stdbool.h>

typedef int Mutex;
#define MUTEX_ACQUIRED    1
#define MUTEX_RELEASED    0

bool AquireMutex(Mutex* pMutex)
{
    bool Ret;
    // Assume atomic section here (implementation specific)
    if( *pMutex == MUTEX_RELEASED )
    {
         *pMutex = MUTEX_ACQUIRED;
         Ret = true;
    } 
    else
    {
         Ret = false;
    }
    // Atomic section end
    return Ret;
}

void ReleaseMutex(Mutex* pMutex)
{
    // Assume atomic section here (implementation specific)
    *pMutex = MUTEX_RELEASED;
    // end atomic section
}

Let's assume the two functions above are atomic (in the actual implementation they are, but the actual implementation is irrelevant for the question).

Each thread shares a globally defined m and runs code similar to this:

extern Mutex m;
// .............
while (!AquireMutex(&m)) ;
// Do stuff
ReleaseMutex(&m);

The question is about the line:

while (!AquireMutex(&m)) ;

Will AquireMutex actually be evaluated on each iteration? Or will the optimizer treat its result as a constant, since it can't see how m changes? Should AquireMutex be declared with a volatile-qualified parameter instead:

bool AquireMutex(volatile Mutex* pMutex);
Eugene Sh.

1 Answer

The answer depends on whether the implementation of these functions is visible at the call site containing the while loop. If it isn't (the call site sees only the declarations, and the definitions live in separate source files), then the volatile keyword changes nothing. The optimizer has no idea what the function does with its argument or whether it has side effects, so it must emit every call.
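
To make the first case concrete, here is a minimal sketch of that layout. The file names (mutex.h, worker.c, mutex.c), the Worker function and sharedCounter are hypothetical, and the three parts are shown in one listing only for brevity; the argument relies on them being compiled as separate translation units without link-time optimization.

```c
#include <stdbool.h>

/* ---- mutex.h (hypothetical): everything the call site gets to see ---- */
typedef int Mutex;
#define MUTEX_ACQUIRED    1
#define MUTEX_RELEASED    0
bool AquireMutex(Mutex* pMutex);
void ReleaseMutex(Mutex* pMutex);

/* ---- worker.c (hypothetical): the call site ---- */
Mutex m = MUTEX_RELEASED;
int sharedCounter = 0;          /* hypothetical shared resource */

void Worker(void)
{
    /* When only the declarations above are visible, the compiler cannot
       know what AquireMutex does with &m, so it has to emit the call on
       every iteration, with or without volatile. */
    while (!AquireMutex(&m))
        ;
    sharedCounter++;            /* "Do stuff" */
    ReleaseMutex(&m);
}

/* ---- mutex.c (hypothetical): definitions, built separately ---- */
bool AquireMutex(Mutex* pMutex)
{
    bool Ret = false;
    /* Atomic section start (implementation specific, omitted here) */
    if (*pMutex == MUTEX_RELEASED)
    {
        *pMutex = MUTEX_ACQUIRED;
        Ret = true;
    }
    /* Atomic section end */
    return Ret;
}

void ReleaseMutex(Mutex* pMutex)
{
    /* Atomic section (implementation specific, omitted here) */
    *pMutex = MUTEX_RELEASED;
}
```

The atomic sections are left as comments here, so this listing only illustrates the visibility argument and is not safe for real concurrent use as it stands.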

On the other hand, if the functions that acquire and release the mutex are inline, so that the complete implementation is visible at the call site, then the optimizer may indeed "tweak" things a little. The catch is in the word "complete": even if the code you posted were inline, the code that starts and ends the critical section probably is not. And even if it is, it may use assembly statements that the optimizer doesn't understand. Even if it's pure C, it probably accesses some volatile memory-mapped registers. Any such piece of code (a call to an external function, an assembly statement, an access to volatile memory) effectively prevents the optimizer from eliminating the calls, because it must assume that each call has side effects.
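
As a rough illustration of the inline case, the sketch below makes everything visible at the call site but wraps the critical section in an assembly statement. EnterCritical and ExitCritical are hypothetical stand-ins for whatever your OS really does (typically disabling and restoring interrupts), and the asm syntax assumes GCC or Clang. The empty asm with a "memory" clobber emits no instructions, yet the compiler must assume memory may have changed, so it cannot reuse a previously loaded value of *pMutex across loop iterations.

```c
#include <stdbool.h>

typedef int Mutex;
#define MUTEX_ACQUIRED    1
#define MUTEX_RELEASED    0

/* Hypothetical placeholders for the real critical-section primitives.
   Assumes GCC/Clang extended asm. These are compiler barriers only;
   a real port would also disable/restore interrupts here. */
static inline void EnterCritical(void) { __asm__ volatile ("" ::: "memory"); }
static inline void ExitCritical(void)  { __asm__ volatile ("" ::: "memory"); }

static inline bool AquireMutex(Mutex* pMutex)
{
    bool Ret = false;
    EnterCritical();
    /* The barrier above forces *pMutex to be re-read from memory on
       every call, even after inlining into a spin loop. */
    if (*pMutex == MUTEX_RELEASED)
    {
        *pMutex = MUTEX_ACQUIRED;
        Ret = true;
    }
    ExitCritical();
    return Ret;
}

static inline void ReleaseMutex(Mutex* pMutex)
{
    EnterCritical();
    *pMutex = MUTEX_RELEASED;
    ExitCritical();
}
```

With this shape, a loop such as while (!AquireMutex(&m)) ; still re-evaluates the mutex on each pass, which is why the volatile qualifier on the parameter should not be needed as long as the critical-section code contains something the optimizer has to treat as a side effect.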

Freddie Chopin
  • Yes, this makes sense. I guess the optimizers can't work across the translation units these days... – Eugene Sh. Feb 27 '17 at 17:40
  • @EugeneSh. - unless you use LTO (; – Freddie Chopin Feb 27 '17 at 17:49
  • Which stands for... ? – Eugene Sh. Feb 27 '17 at 17:50
  • @EugeneSh. - it's Link Time Optimization, which is supposedly able to do optimizations across translation units. But don't worry about this particular thing - if there is any operation with side effects (for example the critical section code with assembly instructions or accessing volatile memory-mapped registers), then you're still safe. – Freddie Chopin Feb 27 '17 at 17:53
  • Cool.. looks like LTO has to be enabled explicitly on my compiler, and it isn't.. but good to know – Eugene Sh. Feb 27 '17 at 18:00