Some JUCE users, including me, are running into an occasional deadlock when opening a CoreAudio device. It happens often enough to be a real problem.

What happens is that the main thread waits on a mutex during a call to AudioDeviceStart(). This is a mutex inside CoreAudio.

The audio callback thread is blocked as well, waiting on an application-specific mutex that synchronises audio callbacks with the main thread.

Since the audio callback never returns, the call to AudioDeviceStart doesn’t return either, and we end up with two threads, each waiting on a lock that is held by the other.
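The cycle described above can be modelled without CoreAudio. The following is only a minimal sketch under assumptions: `appMutex` stands in for the application mutex, `halMutex` stands in for the mutex inside CoreAudio, and it assumes the HAL holds its mutex while it invokes the IOProc, which the stack traces suggest but do not prove. All names here are hypothetical. Timed mutexes are used purely so that the demonstration gives up instead of hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Outcome of the modelled lock attempts (hypothetical type, for illustration only).
struct CycleResult
{
    bool mainGotHalMutex = false;
    bool ioGotAppMutex = false;
};

CycleResult demonstrateLockCycle()
{
    using namespace std::chrono_literals;

    std::timed_mutex appMutex; // models the application mutex
    std::timed_mutex halMutex; // models the mutex inside CoreAudio (assumption)

    // Promises/futures are used as one-shot signals between the two threads.
    std::promise<void> mainHoldsApp, ioHoldsHal, mainTried, ioTried;
    auto mainHoldsAppF = mainHoldsApp.get_future();
    auto ioHoldsHalF = ioHoldsHal.get_future();
    auto mainTriedF = mainTried.get_future();
    auto ioTriedF = ioTried.get_future();

    CycleResult result;

    // Models the IO thread: holds the HAL mutex, then the callback wants the app mutex.
    std::thread ioThread ([&]
    {
        std::lock_guard<std::timed_mutex> hal (halMutex);
        ioHoldsHal.set_value();
        mainHoldsAppF.wait();

        if (appMutex.try_lock_for (50ms))
        {
            result.ioGotAppMutex = true;
            appMutex.unlock();
        }

        ioTried.set_value();
        mainTriedF.wait(); // keep holding halMutex until the main thread has tried
    });

    // Models the main thread: holds the app mutex, then the device call wants the HAL mutex.
    {
        std::lock_guard<std::timed_mutex> app (appMutex);
        mainHoldsApp.set_value();
        ioHoldsHalF.wait();

        if (halMutex.try_lock_for (50ms))
        {
            result.mainGotHalMutex = true;
            halMutex.unlock();
        }

        mainTried.set_value();
        ioTriedF.wait(); // keep holding appMutex until the IO thread has tried
    }

    ioThread.join();
    return result;
}
```

In this model neither thread obtains its second lock, so `demonstrateLockCycle()` returns a result with both flags false. With plain blocking `lock()` calls instead of timed attempts, both threads would wait forever, which matches the behaviour seen in the real application.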

After diving in a bit deeper I was able to get some insight:

  • If I re-open the device every second with a different sample rate, I can consistently reproduce the problem in well under 5 minutes (see the test utility below). Since the deadlock doesn’t occur on every attempt, it looks like a race condition.
  • Only the Mac internal audio device seems to be affected (both speakers and external headphones).
  • Users have reported this problem on Apple Silicon and Intel based systems.
  • This problem only happens when opening a device with a sample rate not equal to the current sample rate of the device (so the device needs reconfiguration).
  • macOS version: 12.4 (Monterey)
  • MacBook Pro 16-inch, 2021, Apple M1 Max, 32GB

All this raises the question of what is allowed when synchronising audio callbacks with the main thread.

  • Are we chasing a bug in CoreAudio or are we violating rules?
  • Is there any specific documentation about synchronising audio callbacks?

To illustrate the problem I created a small CoreAudio-only test utility which resembles how JUCE uses the CoreAudio API:

#include <CoreAudio/AudioHardware.h>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <thread>

OSStatus audioIOProc (
    AudioObjectID inDevice,
    const AudioTimeStamp* inNow,
    const AudioBufferList* inInputData,
    const AudioTimeStamp* inInputTime,
    AudioBufferList* outOutputData,
    const AudioTimeStamp* inOutputTime,
    void* __nullable inClientData)
{
    if (!outOutputData)
        return noErr;

    if (!inClientData)
        return noErr;

    // Synchronise audio callbacks with the main thread, which is not ok during processing but fine when starting or
    // stopping a device.
    std::lock_guard lock (*static_cast<std::mutex*> (inClientData));

    // Silence the output buffers the quick and rough way.
    for (UInt32 i = 0; i < outOutputData->mNumberBuffers; ++i)
    {
        auto& buffer = outOutputData->mBuffers[i];

        for (UInt32 j = 0; j < buffer.mDataByteSize; ++j)
        {
            static_cast<uint8_t*> (buffer.mData)[j] = 0;
        }
    }

    return noErr;
}

int main (int argc, const char* argv[])
{
    AudioObjectPropertyAddress property;
    property.mSelector = kAudioHardwarePropertyDefaultOutputDevice;
    property.mScope = kAudioObjectPropertyScopeGlobal;
    property.mElement = kAudioObjectPropertyElementMain;

    UInt32 propertySize = 0;
    if (AudioObjectGetPropertyDataSize (kAudioObjectSystemObject, &property, 0, nullptr, &propertySize) != noErr)
        throw std::runtime_error ("Failed to get size of default output device property");

    AudioObjectID defaultOutputDevice = 0;
    if (AudioObjectGetPropertyData (
            kAudioObjectSystemObject,
            &property,
            0,
            nullptr,
            &propertySize,
            &defaultOutputDevice) != noErr)
        throw std::runtime_error ("Failed to get default output device");

    AudioDeviceIOProcID procID = nullptr;

    std::mutex mutex;

    if (AudioDeviceCreateIOProcID (defaultOutputDevice, audioIOProc, &mutex, &procID) != noErr)
        throw std::runtime_error ("Failed to create procid");

    int counter = 0;
    bool sampleRateToggle = false;

    while (true)
    {
        property.mSelector = kAudioDevicePropertyNominalSampleRate;
        property.mScope = kAudioObjectPropertyScopeGlobal;
        property.mElement = kAudioObjectPropertyElementMain;

        double sampleRate = sampleRateToggle ? 48000.0 : 44100.0;
        if (AudioObjectSetPropertyData (defaultOutputDevice, &property, 0, nullptr, sizeof sampleRate, &sampleRate) !=
            noErr)
            throw std::runtime_error ("Failed to set samplerate");

        sampleRateToggle = !sampleRateToggle;

        {
            std::lock_guard lock (mutex);
            if (AudioDeviceStart (defaultOutputDevice, procID) != noErr)
                throw std::runtime_error ("Failed to start device");
        }

        using namespace std::chrono_literals;
        std::this_thread::sleep_for (1000ms);

        std::cout << "Opened device (" << ++counter << ")" << std::endl;

        {
            std::lock_guard lock (mutex);
            if (AudioDeviceStop (defaultOutputDevice, nullptr) != noErr)
                throw std::runtime_error ("Failed to stop device");
        }
    }

    // We never reach this, but hey...

    if (AudioDeviceDestroyIOProcID (defaultOutputDevice, procID) != noErr)
        throw std::runtime_error ("Failed to destroy procid");

    return 0;
}

This is the stack trace captured when the deadlock happens:

com.apple.main-thread:

__psynch_mutexwait 0x000000019c1dd738
_pthread_mutex_firstfit_lock_wait 0x000000019c215384
_pthread_mutex_firstfit_lock_slow 0x000000019c212cf8
HALB_Mutex::Lock() 0x000000019e1ac898
HALC_ProxyIOContext::StopIOProc(int (*)(unsigned int, AudioTimeStamp const*, AudioBufferList const*, AudioTimeStamp const*, AudioBufferList*, AudioTimeStamp const*, void*)) 0x000000019ddb9838
HAL_HardwarePlugIn_DeviceStop(AudioHardwarePlugInInterface**, unsigned int, int (*)(unsigned int, AudioTimeStamp const*, AudioBufferList const*, AudioTimeStamp const*, AudioBufferList*, AudioTimeStamp const*, void*)) 0x000000019dd838bc
HALDevice::StopIOProc(int (*)(unsigned int, AudioTimeStamp const*, AudioBufferList const*, AudioTimeStamp const*, AudioBufferList*, AudioTimeStamp const*, void*)) 0x000000019e1a6028
AudioDeviceStop 0x000000019dbffdb8
main CoreAudioMain.cpp:94
start 0x000000010410908c

com.apple.audio.IOThread.client:

__psynch_mutexwait 0x000000019c1dd738
_pthread_mutex_firstfit_lock_wait 0x000000019c215384
_pthread_mutex_firstfit_lock_slow 0x000000019c212cf8
std::__1::mutex::lock() 0x000000019c1691a8
std::lock_guard::lock_guard(std::mutex &) __mutex_base:90
std::lock_guard::lock_guard(std::mutex &) __mutex_base:90
audioIOProc(unsigned int, const AudioTimeStamp *, const AudioBufferList *, const AudioTimeStamp *, AudioBufferList *, const AudioTimeStamp *, void *) CoreAudioMain.cpp:22
HALC_ProxyIOContext::IOWorkLoop() 0x000000019ddb7db4
invocation function for block in HALC_ProxyIOContext::HALC_ProxyIOContext(unsigned int, unsigned int) 0x000000019ddb5efc
HALB_IOThread::Entry(void*) 0x000000019df82304
_pthread_start 0x000000019c21826c

EDIT: Cross reference to the JUCE forum

Ruurd Adema
  • `std::lock_guard lock (*static_cast<std::mutex*> (inClientData));` it burns my eyes. Will explain why later when will have more time (or maybe someone will be faster). – Marek R Aug 04 '22 at 11:39
  • Perhaps replace the `std::lock_guard lock (*static_cast<std::mutex*> (inClientData))` with a `std::mutex* m = static_cast<std::mutex*> (inClientData); if (m->try_lock()) { ... do stuff ...; m->unlock(); }` instead; that could avoid the deadlock. – Jeremy Friesner Aug 04 '22 at 14:18
  • That is a great suggestion! That could probably be a simple fix which could be applied to JUCE. Thanks. – Ruurd Adema Aug 04 '22 at 17:42
  • @RuurdAdema what if the writeout to buffers is missed? If risking an audio glitch is fine, then OK, but is it? – alagner Aug 05 '22 at 07:11
  • It depends on what you mean by audio glitch. In this case I can think of 2 potential problems. One is that we miss reading and writing a buffer when the try-lock fails, which I think is fine because the mutex only gets locked when starting or stopping a device. The other is returning garbage data by not writing anything into the output buffer. That is never ok because it can blow up speakers or eardrums, but it can easily be managed by clearing the output buffers in the try-lock-failed branch of the callback. What do you think? – Ruurd Adema Aug 05 '22 at 08:00
  • @MarekR Please define 'later' :-). I would be interested to hear your thoughts. – Ruurd Adema Oct 13 '22 at 09:58
