17

I'd like to build a synthesizer for the iPhone. I understand that it's possible to use custom audio units on the iPhone. At first glance this sounds promising, since there are lots and lots of Audio Unit programming resources available. However, using custom audio units on the iPhone seems a bit tricky (see: http://lists.apple.com/archives/Coreaudio-api/2008/Nov/msg00262.html)

This seems like the sort of thing that loads of people must be doing, but a simple Google search for "iphone audio synthesis" doesn't turn up anything along the lines of a nice, easy tutorial or a recommended toolkit.

So, anyone here have experience synthesizing sound on the iPhone? Are custom audio units the way to go, or is there another, simpler approach I should consider?

morgancodes
  • On Mac OS the use of Audio Units is really convenient and provides a ton of functionality. Unfortunately there seems to be almost nothing implemented on the iPhone. There are a few basic AUs, but nowhere near what's available on Mac OS. I filed a bug a year ago, but no progress yet. – D.C. Jan 18 '10 at 02:47
  • I googled "audio units iphone" and found someone talking about using custom AUs on the iPhone, but it seems a little hairy. – morgancodes Jan 26 '10 at 16:08

8 Answers

22

I'm also investigating this. I think the AudioQueue API is probably the way to go.

Here's as far as I got; it seems to work okay.

File: BleepMachine.h

//
//  BleepMachine.h
//  WgHeroPrototype
//
//  Created by Andy Buchanan on 05/01/2010.
//  Copyright 2010 Andy Buchanan. All rights reserved.
//

#include <AudioToolbox/AudioToolbox.h>

// Class to implement sound playback using the AudioQueue APIs.
// Currently just supports playing two sine wave tones, one per
// stereo channel. The sound data is little-endian signed 16-bit @ 44.1kHz
//
class BleepMachine
{
    static void staticQueueCallback( void* userData, AudioQueueRef outAQ, AudioQueueBufferRef outBuffer )
    {
        BleepMachine* pThis = reinterpret_cast<BleepMachine*> ( userData );
        pThis->queueCallback( outAQ, outBuffer );
    }
    void queueCallback( AudioQueueRef outAQ, AudioQueueBufferRef outBuffer );

    AudioStreamBasicDescription m_outFormat;

    AudioQueueRef m_outAQ;

    enum 
    {
        kBufferSizeInFrames = 512,
        kNumBuffers = 4,
        kSampleRate = 44100,
    };

    AudioQueueBufferRef m_buffers[kNumBuffers];

    bool m_isInitialised;

    struct Wave 
    {
        Wave(): volume(1.f), phase(0.f), frequency(0.f), fStep(0.f) {}
        float   volume;
        float   phase;
        float   frequency;
        float   fStep;
    };

    enum 
    {
        kLeftWave = 0,
        kRightWave = 1,
        kNumWaves,
    };

    Wave m_waves[kNumWaves];

public:
    BleepMachine();
    ~BleepMachine();

    bool Initialise();
    void Shutdown();

    bool Start();
    bool Stop();

    bool SetWave( int id, float frequency, float volume );
};

// Notes by name. Integer value is number of semitones above A.
enum Note
{
    A       = 0,
    Asharp,
    B,
    C,
    Csharp,
    D,
    Dsharp,
    E,
    F,
    Fsharp,
    G,
    Gsharp,

    Bflat = Asharp,
    Dflat = Csharp,
    Eflat = Dsharp,
    Gflat = Fsharp,
    Aflat = Gsharp,
};

// Helper function calculates fundamental frequency for a given note
float CalculateFrequencyFromNote( SInt32 semiTones, SInt32 octave=4 );
float CalculateFrequencyFromMIDINote( SInt32 midiNoteNumber );

File: BleepMachine.mm

//
//  BleepMachine.mm
//  WgHeroPrototype
//
//  Created by Andy Buchanan on 05/01/2010.
//  Copyright 2010 Andy Buchanan. All rights reserved.
//

#include "BleepMachine.h"

#include <math.h>    // sinf, fmodf, M_PI
#include <stdio.h>   // printf

void BleepMachine::queueCallback( AudioQueueRef outAQ, AudioQueueBufferRef outBuffer )
{
    // Render the wave

    // AudioQueueBufferRef is documented as "opaque", but it's a reference to
    // an AudioQueueBuffer struct whose fields are public. All of Apple's
    // samples manipulate it directly, so I'm not quite sure what they mean
    // by opaque here.
    SInt16* coreAudioBuffer = (SInt16*)outBuffer->mAudioData;

    // Specify how many bytes we're providing
    outBuffer->mAudioDataByteSize = kBufferSizeInFrames * m_outFormat.mBytesPerFrame;

    // Generate the sine waves as signed 16-bit stereo interleaved (little-endian)
    float volumeL = m_waves[kLeftWave].volume;
    float volumeR = m_waves[kRightWave].volume;
    float phaseL = m_waves[kLeftWave].phase;
    float phaseR = m_waves[kRightWave].phase;
    float fStepL = m_waves[kLeftWave].fStep;
    float fStepR = m_waves[kRightWave].fStep;

    for( int s=0; s<kBufferSizeInFrames*2; s+=2 )
    {
        float sampleL = ( volumeL * sinf( phaseL ) );
        float sampleR = ( volumeR * sinf( phaseR ) );

        short sampleIL = (int)(sampleL * 32767.0);
        short sampleIR = (int)(sampleR * 32767.0);

        coreAudioBuffer[s] =   sampleIL;
        coreAudioBuffer[s+1] = sampleIR;

        phaseL += fStepL;
        phaseR += fStepR;
    }

    m_waves[kLeftWave].phase = fmodf( phaseL, 2 * M_PI );   // Take modulus to preserve precision
    m_waves[kRightWave].phase = fmodf( phaseR, 2 * M_PI );

    // Enqueue the buffer
    AudioQueueEnqueueBuffer( m_outAQ, outBuffer, 0, NULL ); 
}

bool BleepMachine::SetWave( int id, float frequency, float volume )
{
    if ( ( id < kLeftWave ) || ( id >= kNumWaves ) ) return false;

    Wave& wave = m_waves[ id ];

    wave.volume = volume;
    wave.frequency = frequency;
    wave.fStep = 2 * M_PI * frequency / kSampleRate;

    return true;
}

bool BleepMachine::Initialise()
{
    m_outFormat.mSampleRate = kSampleRate;
    m_outFormat.mFormatID = kAudioFormatLinearPCM;
    m_outFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    m_outFormat.mFramesPerPacket = 1;
    m_outFormat.mChannelsPerFrame = 2;
    m_outFormat.mBytesPerPacket = m_outFormat.mBytesPerFrame = sizeof(SInt16) * 2; // 16-bit stereo
    m_outFormat.mBitsPerChannel = 16;
    m_outFormat.mReserved = 0;

    OSStatus result = AudioQueueNewOutput(
                                          &m_outFormat,
                                          BleepMachine::staticQueueCallback,
                                          this,
                                          NULL,
                                          NULL,
                                          0,
                                          &m_outAQ
                                          );

    if ( result )   // OSStatus is noErr (0) on success
    {
        printf( "ERROR: %d\n", (int)result );
        return false;
    }

    // Allocate buffers for the audio
    UInt32 bufferSizeBytes = kBufferSizeInFrames * m_outFormat.mBytesPerFrame;

    for ( int buf=0; buf<kNumBuffers; buf++ ) 
    {
        OSStatus result = AudioQueueAllocateBuffer( m_outAQ, bufferSizeBytes, &m_buffers[ buf ] );
        if ( result )
        {
            printf( "ERROR: %d\n", (int)result );
            return false;
        }

        // Prime the buffers
        queueCallback( m_outAQ, m_buffers[ buf ] );
    }

    m_isInitialised = true;
    return true;
}

void BleepMachine::Shutdown()
{
    Stop();

    if ( m_outAQ )
    {
        // AudioQueueDispose also chucks any audio buffers it has
        AudioQueueDispose( m_outAQ, true );
    }

    m_isInitialised = false;
}

BleepMachine::BleepMachine()
: m_isInitialised(false), m_outAQ(0)
{
    for ( int buf=0; buf<kNumBuffers; buf++ ) 
    {
        m_buffers[ buf ] = NULL;
    }
}

BleepMachine::~BleepMachine()
{
    Shutdown();
}

bool BleepMachine::Start()
{
    OSStatus result = AudioQueueSetParameter( m_outAQ, kAudioQueueParam_Volume, 1.0 );
    if ( result ) printf( "ERROR: %d\n", (int)result );

    // Start the queue
    result = AudioQueueStart( m_outAQ, NULL );
    if ( result ) printf( "ERROR: %d\n", (int)result );

    return true;
}

bool BleepMachine::Stop()
{
    OSStatus result = AudioQueueStop( m_outAQ, true );
    if ( result ) printf( "ERROR: %d\n", (int)result );

    return true;
}

// A    (A4=440)
// A#   f(n)=2^(n/12) * r
// B    where n = number of semitones
// C    and r is the root frequency e.g. 440
// C#
// D    frq -> MIDI note number
// D#   p = 69 + 12 x log2(f/440)
// E
// F    
// F#
// G
// G#
//
// MIDI Note ref: http://www.phys.unsw.edu.au/jw/notes.html
//
// MIDI note numbers:
// A3   57
// A#3  58
// B3   59
// C4   60 <--
// C#4  61
// D4   62
// D#4  63
// E4   64
// F4   65
// F#4  66
// G4   67
// G#4  68
// A4   69 <--
// A#4  70
// B4   71
// C5   72

float CalculateFrequencyFromNote( SInt32 semiTones, SInt32 octave )
{
    semiTones += ( 12 * (octave-4) );
    float root = 440.f;
    float fn = powf( 2.f, (float)semiTones/12.f ) * root;
    return fn;
}

float CalculateFrequencyFromMIDINote( SInt32 midiNoteNumber )
{
    SInt32 semiTones = midiNoteNumber - 69;
    return CalculateFrequencyFromNote( semiTones, 4 );
}

//for ( SInt32 midiNote=21; midiNote<=108; ++midiNote )
//{
//  printf( "MIDI Note %d: %f Hz \n",(int)midiNote,CalculateFrequencyFromMIDINote( midiNote ) );
//}

Update: Basic usage info

  1. Initialise. Somewhere near the start; I'm using initFromNib: in my code

    m_bleepMachine = new BleepMachine;
    m_bleepMachine->Initialise();
    m_bleepMachine->Start();
    
  2. Now the sound playback is running, but generating silence.

  3. In your code, call this when you want to change the tone generation

    m_bleepMachine->SetWave( ch, frq, vol );
    
    • where ch is the channel ( 0 or 1 )
    • where frq is the frequency to set in Hz
    • where vol is the volume ( 0 = -inf dB, 1 = 0 dB )
  4. At program termination

    delete m_bleepMachine;
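
For example, combining this with the note helper from the header (an untested sketch; 0 is the left channel and 1 the right, matching the wave enum in the class):

    m_bleepMachine->SetWave( 0, CalculateFrequencyFromNote( A, 4 ), 0.5f ); // A4 = 440 Hz at half volume
    m_bleepMachine->SetWave( 1, CalculateFrequencyFromNote( E, 4 ), 0.5f ); // a fifth above, ~659 Hz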
    
Andy J Buchanan
  • Thanks Andy. Much appreciated. I'm hoping I can find a toolkit that will handle some of this lower-level stuff for me. I'm coming from SuperCollider, and really like just being able to plug unit generators into one another. But seeing an example of a lower-level implementation is great. Quick newbie question -- what's the .mm extension? Also, if you feel like pasting in an example usage of your BleepMachine class, I'd be super grateful. – morgancodes Jan 15 '10 at 15:52
  • .mm extension = Objective-C++ – Andy J Buchanan Jan 16 '10 at 21:20
  • Added basic usage info to answer. – Andy J Buchanan Jan 18 '10 at 02:44
16

Since my original post almost a year ago, I've come a long way. After a pretty exhaustive search, I came up with very few high-level synthesis tools suitable for iOS development. There are many which are GPL-licensed, but the GPL is too restrictive for me to feel comfortable using it. libpd works great, and is what RjDj uses, but I found myself really frustrated by the graphical programming paradigm. JSyn's C-based engine, CSyn, is an option, but it requires licensing, and I'm really used to programming with open-source tools. It does look worth a close look, though.

In the end, I'm using STK as my basic framework. STK is a very low-level tool, and requires extensive buffer-level programming to get working. This is in contrast to something higher level like PD or SuperCollider, which allows you to simply plug unit generators together and not worry about handling the raw audio data.

Working this way with STK is certainly a bit slower than with a high-level tool, but I'm becoming comfortable with it, especially now that I'm getting more comfortable with C/C++ programming in general.
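
To give a flavour of that buffer-level style, here's a minimal sketch of driving a single STK unit generator from a render callback (assuming STK 4.4.x, where classes live in the stk namespace; renderTone and its arguments are just illustrative names, and stk::Stk::setSampleRate( 44100.0 ) would need to be called once at startup):

#include "SineWave.h"   // STK's sine oscillator

static stk::SineWave s_osc;   // one unit generator

// Fill a mono 16-bit buffer one sample at a time -- this is the kind of
// loop you end up writing yourself when working at STK's level
void renderTone( SInt16* out, int numFrames, float frequency )
{
    s_osc.setFrequency( frequency );
    for ( int i = 0; i < numFrames; ++i )
        out[i] = (SInt16)( s_osc.tick() * 32767.0 );  // tick() returns one float sample
}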

There's a new project under way to create a patching-style add-on to openFrameworks. It's called Cleo, I think, out of the University of Vancouver. It hasn't been released yet, but it looks like a very nice mix: patching-style connection of unit generators in C++ rather than requiring the use of another language. And it's tightly integrated with openFrameworks, which may be appealing or not, depending.

So, to answer my original question, first you need to learn how to write to the output buffer. Here's some good sample code for that:

http://atastypixel.com/blog/using-remoteio-audio-unit/
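
For reference, the render callback that tutorial builds up to has roughly this shape (a sketch only, assuming you've configured the RemoteIO unit with a 16-bit interleaved stereo stream format):

static OSStatus renderCallback( void* inRefCon,
                                AudioUnitRenderActionFlags* ioActionFlags,
                                const AudioTimeStamp* inTimeStamp,
                                UInt32 inBusNumber,
                                UInt32 inNumberFrames,
                                AudioBufferList* ioData )
{
    // With an interleaved stereo format there's a single buffer to fill
    SInt16* out = (SInt16*)ioData->mBuffers[0].mData;
    for ( UInt32 i = 0; i < inNumberFrames * 2; i += 2 )
    {
        SInt16 sample = 0;   // your synthesis code computes this
        out[i]   = sample;   // left
        out[i+1] = sample;   // right
    }
    return noErr;
}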

Then you need to do some synthesis to generate the audio data. If you like patching, I wouldn't hesitate to recommend libpd. It seems to work great, and you can work the way you're accustomed to. If you hate graphical patching (like me), your best starting place for now is probably STK. If STK and low-level audio programming seem a bit over your head (like they were for me), just roll up your sleeves, pack a tent, and set out on a bit of a long hike up the learning curve. You'll be a much better programmer for it in the end.

Another bit of advice I wish I could have given myself a year ago: join Apple's Core Audio mailing list.

============== 2014 Edit ===========

I'm now using (and actively contributing to) the Tonic audio synthesis library. It's awesome, if I do say so myself.
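
For a quick taste of what Tonic code looks like, something along these lines sets up a sine tone with a controllable frequency (a from-memory sketch; check the project README for the exact API):

using namespace Tonic;

Synth synth;

// Expose a frequency parameter and hook a sine wave up to the output
ControlParameter freq = synth.addParameter( "freq", 440.0f );
synth.setOutputGen( SineWave().freq( freq ) * 0.5f );

// ...then, in your RemoteIO or AudioQueue callback:
// synth.fillBufferOfFloats( buffer, numFrames, numChannels );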

morgancodes
  • Is there any documentation about Tonic besides the README? I'd like to use it for an iOS app. Thanks for the work. – guardabrazo Jun 17 '14 at 15:54
3

With the enormous caveat that I have yet to get through all the documentation or finish browsing the classes / sample code, it looks like the fine folks at CCRMA over at Stanford have put some nice toolkits together for our audio hacking pleasure. No guarantees these will do exactly what you want, but based on what I know about the original STK, they should do the trick. I'm about to embark on an audio synth app myself, and the more code I can reuse, the better.

Links / descriptions from their site...

MoMu : MoMu is a light-weight software toolkit for creating musical instruments and experiences on mobile devices, and currently supports the iPhone platform (iPhone, iPad, iPod Touch). MoMu provides APIs for real-time full-duplex audio, accelerometer, location, multi-touch, networking (via OpenSoundControl), graphics, and utilities. (yada yada)

• and •

MoMu STK : The MoMu release of the Synthesis Toolkit (STK, originally by Perry R. Cook and Gary P. Scavone) is a lightly modified version of STK 4.4.2, and currently supports the iPhone platform (iPhone, iPad, iPod Touches).
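
Based on the MoMu examples, getting an audio callback running looks roughly like this (the signatures are from memory, so treat them as assumptions and check mo_audio.h before copying anything):

#include "mo_audio.h"

// MoMu hands your callback an interleaved full-duplex Float32 buffer
void audioCallback( Float32* buffer, UInt32 numFrames, void* userData )
{
    for ( UInt32 i = 0; i < numFrames * 2; i += 2 )
        buffer[i] = buffer[i+1] = 0.0f;   // your synthesis goes here
}

// ...at startup:
MoAudio::init( 44100.0, 512, 2 );        // sample rate, frame size, channels
MoAudio::start( audioCallback, NULL );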

Eric Humphrey
  • Thanks Eric. STK is what I wound up using for my app, Thicket. Not sure how much of a leg up MoMu will give you. I found it pretty easy to just use a few STK classes and write directly to the RemoteIO callback. Not totally sure what the point of MoMu is. Most of the stuff it seems to offer is pretty easy to do directly via the Cocoa classes (with the exception of OSC). – morgancodes Feb 24 '11 at 17:04
  • One little tip regarding STK: I found last night that I get around a 15% performance boost if I change StkFloat to use float instead of double. Also -- this will be obvious to anyone with signal processing experience, but wasn't to me -- it's much more efficient to have every method call process a buffer of samples rather than a single sample. – morgancodes Feb 24 '11 at 17:05
  • +1 Good call Eric! Nice to come across your name on Stack Overflow. -montag – Matt Montag Aug 08 '11 at 19:55
1

I'm one of the other contributors to Tonic along with morgancodes. For wrangling CoreAudio in a higher-level framework, I can't give enough praise to The Amazing Audio Engine.

We've both used it in tandem with Tonic in a number of projects. It takes so much of the pain out of dealing with CoreAudio directly, letting you focus on the actual content and synthesis instead of the hardware abstraction layer.

roperklacks
1

Lately I've been using AudioKit.

It's a fresh and well-designed wrapper over Csound, which has been around for ages.

I was using Tonic with openFrameworks, and I found myself missing programming in Swift. Although Tonic and openFrameworks are both powerful tools, I've chosen to get in bed with Swift.

Paul Wand
  • Update: I went back to good old openFrameworks after I wasn't getting the kind of sound I wanted. – Paul Wand May 09 '15 at 21:49
  • Hi Paul, I'm actually one of the developers of AudioKit...sorry you weren't getting the kind of sound you were after. I'd love to hear more about what you're after with AudioKit...feel free to shoot me an email at nick (at) audiokit (dot) io – narner May 21 '15 at 13:35
1

I'm just getting into Audio Unit programming for iPhone to build a synth-like app as well. The Apple guide "Audio Unit Hosting Guide for iOS" seems like a good reference:

http://developer.apple.com/library/ios/#documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/AudioUnitHostingFundamentals/AudioUnitHostingFundamentals.html#//apple_ref/doc/uid/TP40009492-CH3-SW11

The guide includes links to a couple of sample projects, Audio Mixer (MixerHost) and aurioTouch:

http://developer.apple.com/library/ios/samplecode/MixerHost/Introduction/Intro.html#//apple_ref/doc/uid/DTS40010210

http://developer.apple.com/library/ios/samplecode/aurioTouch/Introduction/Intro.html#//apple_ref/doc/uid/DTS40007770

Chris Livdahl
0

PD has a version that runs on the iPhone, used by RjDj. If you are OK with using someone else's app rather than writing your own, you can do quite a bit in an RjDj scene, and there is a set of objects that let you patch it out and test it on a regular PD on your own computer.

I should mention: PD is a visual dataflow programming language. That is to say, it is Turing-complete and can be used to develop graphical applications, but if you are going to do anything interesting, I would definitely look into best practices for patching.
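
If you'd rather embed a PD patch in your own app than ship an RjDj scene, libpd (mentioned elsewhere on this page) wraps PD's DSP core behind a small C API. Roughly (a sketch; the patch name is hypothetical and error checking is omitted):

#include "z_libpd.h"

// One-time setup: no inputs, stereo output at 44.1kHz, then load a patch
libpd_init();
libpd_init_audio( 0, 2, 44100 );
libpd_openfile( "synth.pd", "." );

// Switch DSP on -- the equivalent of sending [; pd dsp 1( in PD
libpd_start_message( 1 );
libpd_add_float( 1.0f );
libpd_finish_message( "pd", "dsp" );

// Then, in your render callback: one PD tick is 64 frames
float inBuf[64];        // unused here (no input channels)
float outBuf[64 * 2];   // interleaved stereo out
libpd_process_float( 1, inBuf, outBuf );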

Justin Smith
  • Thanks Justin, I definitely want to be able to develop my own app and not be tied to RjDj. I know SuperCollider is iPhone-able, but I think the SC license precludes using it in any sort of commercial software, so that's a bit of a turn-off. – morgancodes Jan 22 '10 at 17:45
  • I am actually a contributor to SuperCollider, and it has no such restriction. It is GPL, which means that if you modify the SuperCollider source, you need to share those modifications with anyone who gets the app (just as gcc is GPL, so any modifications you make to gcc to make your app run need to be shared), but if your code is using SuperCollider without modifying SuperCollider itself, there is no such restriction at all. That said, I am pretty sure you can only use the iPhone version of SC on jailbroken phones at the moment, which is why I did not suggest it. – Justin Smith Jan 22 '10 at 18:27
  • If I recall correctly, the reason that SC is not available through the App Store is that it is a full-fledged interpreter. By these standards RjDj should be excluded as well (PD is an unusual platform for programming, but it is a programming environment), but as far as I am concerned neither should be. – Justin Smith Jan 22 '10 at 18:59
0

Last time I checked, you couldn't use custom AUs on iOS in a way that would allow all installed apps to use them (like on Mac OS X).

You could theoretically use a custom AU from inside your iOS app by loading it from the app's bundle and calling the AU's render function directly, but then you might as well add the code directly to your app. Also, I'm pretty sure that loading and calling code that sits in a dynamic library would go against the App Store policies.

So you will either have to do the processing in your RemoteIO callback, or use the preinstalled Apple AUs within an AUGraph.
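
The RemoteIO route boils down to the usual incantation, roughly like this (a sketch; stream format setup and error checking are omitted, and myRenderCallback / myUserData are placeholders for your own AURenderCallback and context pointer):

// Find and instantiate the RemoteIO unit
AudioComponentDescription desc = { 0 };
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent comp = AudioComponentFindNext( NULL, &desc );
AudioUnit ioUnit;
AudioComponentInstanceNew( comp, &ioUnit );

// Attach your render callback to the output element (bus 0)
AURenderCallbackStruct cb = { myRenderCallback, myUserData };
AudioUnitSetProperty( ioUnit, kAudioUnitProperty_SetRenderCallback,
                      kAudioUnitScope_Input, 0, &cb, sizeof(cb) );

AudioUnitInitialize( ioUnit );
AudioOutputUnitStart( ioUnit );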

tahome