
I'm trying to combine two CAF files locally into a single file. The two CAF files are mono streams, and ideally I'd like them to become one stereo file so that I can have the mic on one channel and the speaker on the other.

I originally started by using AVAssetTrack and AVMutableCompositionTrack, but I couldn't get the mixing right: my merged file was a single mono stream that interleaved the two files. So I've opted to go the AVAudioEngine route.

From my understanding, I can pass in my two files as input nodes, attach them to a mixer, and have an output node from which I can obtain the stereo mix. The output file has a stereo layout, but no audio data seems to be written to it: I can open it in Audacity and see the stereo layout, but the channels are empty. Placing a dispatch semaphore around the installTapOnBus call did not help much either. Any insight would be appreciated, as CoreAudio has been a challenge to understand.

// obtain path of microphone and speaker files
NSString *micPath = [[NSBundle mainBundle] pathForResource:@"microphone" ofType:@"caf"];
NSString *spkPath = [[NSBundle mainBundle] pathForResource:@"speaker" ofType:@"caf"];
NSURL *micURL = [NSURL fileURLWithPath:micPath];
NSURL *spkURL = [NSURL fileURLWithPath:spkPath];

// create engine
AVAudioEngine *engine = [[AVAudioEngine alloc] init];

AVAudioFormat *stereoFormat = [[AVAudioFormat alloc] initStandardFormatWithSampleRate:16000 channels:2];

AVAudioMixerNode *mainMixer = engine.mainMixerNode;

// create audio files
AVAudioFile *audioFile1 = [[AVAudioFile alloc] initForReading:micURL error:nil];
AVAudioFile *audioFile2 = [[AVAudioFile alloc] initForReading:spkURL error:nil];

// create player input nodes
AVAudioPlayerNode *apNode1 = [[AVAudioPlayerNode alloc] init];
AVAudioPlayerNode *apNode2 = [[AVAudioPlayerNode alloc] init];

// attach nodes to the engine
[engine attachNode:apNode1];
[engine attachNode:apNode2];

// connect player nodes to engine's main mixer
stereoFormat = [mainMixer outputFormatForBus:0];
[engine connect:apNode1 to:mainMixer fromBus:0 toBus:0 format:audioFile1.processingFormat];
[engine connect:apNode2 to:mainMixer fromBus:0 toBus:1 format:audioFile2.processingFormat];
[engine connect:mainMixer to:engine.outputNode format:stereoFormat];

// start the engine
NSError *error = nil;
if(![engine startAndReturnError:&error]){
    NSLog(@"Engine failed to start.");
}

// create output file
NSString *mergedAudioFile = [[micPath stringByDeletingLastPathComponent] stringByAppendingPathComponent:@"merged.caf"];
[[NSFileManager defaultManager] removeItemAtPath:mergedAudioFile error:&error];
NSURL *mergedURL = [NSURL fileURLWithPath:mergedAudioFile];
AVAudioFile *outputFile = [[AVAudioFile alloc] initForWriting:mergedURL settings:[engine.inputNode inputFormatForBus:0].settings error:&error];

// write from buffer to output file
[mainMixer installTapOnBus:0 bufferSize:4096 format:[mainMixer outputFormatForBus:0] block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when){
    NSError *error;
    BOOL success;
    NSLog(@"Writing");
    if((outputFile.length < audioFile1.length) || (outputFile.length < audioFile2.length)){
        success = [outputFile writeFromBuffer:buffer error:&error];
        NSCAssert(success, @"error writing buffer data to file, %@", [error localizedDescription]);
        if(error){
            NSLog(@"Error: %@", error);
        }
    }
    else{
        [mainMixer removeTapOnBus:0];
        NSLog(@"Done writing");
    }
}];


A21
  • Are you holding a strong reference to the AVAudioFile you're writing to? – dave234 Feb 14 '17 at 23:00
  • @Dave, the outputFile does not exist prior to being written to. In terms of strong reference, I'm setting that audioFile to write to the mergedURL, which is the fileURLWithPath of mergedAudioFile. There are no other objects/variables referencing outputFile, and I'm not destroying it after the installTapOnBus call. – A21 Feb 15 '17 at 03:47
  • One weakness of this approach is that you would have to wait for the duration of the files for them to be rendered into one. That being said, if you do stick with AVAudioEngine, you might try getting both files to play first. Then once that step is complete, install the tap and write to file (see the sketch after these comments). But if I were to do it myself I would use the C APIs. – dave234 Feb 15 '17 at 05:12
  • I'm actually not trying to get the files to play on the phone itself. I just want an outputFile to contain the stereo data and, if need be, to play it in Audacity. Would a dispatch_semaphore wrapped around that call help? I'll give that a shot again. I understand that if I were to use C, I'd have to manipulate the buffers themselves. Although I'm not sure how I can extract the buffer from the input audio files at the moment. I saw that I could utilize the answer to this - http://stackoverflow.com/questions/6292905/mono-to-stereo-conversion - to get my output buffer, but I am concerned about the header. – A21 Feb 15 '17 at 13:34
  • 1
    You should use ExtAudioFile to read and write the files. – dave234 Feb 15 '17 at 14:20
  • Since AVAudioEngine doesn't have a built-in offline render feature, and you aren't tied to a specific API, you should ask a more general question. Something like "How to convert two mono files into one stereo file in OS X or iOS". – dave234 Feb 15 '17 at 19:56
  • That's a good point! I've re-titled my question to address that. In the meantime I'm going to see if I can get this to work reading and writing using ExtAudioFile. – A21 Feb 15 '17 at 20:19
  • I tried it using the output of my AVMutableComposition and AudioConverterServices utilizing ExtAudioFile, but I ended up with a stereo file with both original input files interleaved in both channels. So I got a mono to stereo conversion, but not the exact output hoped for. However, I'm looking at your answer right now, and I think beginning with 3 buffers and reading each audio file to the correct "half" is the proper approach. Will let you know how it works out for me. Thanks for taking a look at this! Much appreciated. – A21 Feb 16 '17 at 14:44
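
A minimal sketch of the "get both files playing first, then install the tap" suggestion from the comments above, reusing the engine, player node, audio file, and output file names from the question's code. This is only a sketch, untested, and it still renders in real time:

// Schedule both source files on their player nodes before tapping the mixer
[apNode1 scheduleFile:audioFile1 atTime:nil completionHandler:nil];
[apNode2 scheduleFile:audioFile2 atTime:nil completionHandler:nil];

// Capture the mixer's output and append each buffer to the output file
[mainMixer installTapOnBus:0 bufferSize:4096 format:[mainMixer outputFormatForBus:0] block:^(AVAudioPCMBuffer *buffer, AVAudioTime *when){
    NSError *writeError = nil;
    if(![outputFile writeFromBuffer:buffer error:&writeError]){
        NSLog(@"Error writing buffer: %@", writeError);
    }
}];

// Start playback only after the tap is in place
[apNode1 play];
[apNode2 play];
// You would still have to wait out the files' duration before removing the tap.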

1 Answer


Doing this with ExtAudioFile involves three files and three buffers: two mono for reading, and one stereo for writing. In a loop, each mono file reads a small segment of audio into its mono buffer, which is then copied into the correct "half" of the stereo buffer. Once the stereo buffer is full of data, it is written to the output file. Repeat until both mono files have finished reading (writing zeroes if one mono file is longer than the other).

The most problematic area for me was getting the file formats right; Core Audio wants very specific formats. Luckily, AVAudioFormat exists to simplify the creation of some common formats.
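
For reference, here is a sketch of the AudioStreamBasicDescription you would otherwise have to fill in by hand for the 16-bit interleaved stereo format used below; AVAudioFormat's streamDescription property produces an equivalent structure:

// Hand-built ASBD for 44.1 kHz, 16-bit signed integer, interleaved stereo LPCM
AudioStreamBasicDescription asbd = {0};
asbd.mSampleRate       = 44100.0;
asbd.mFormatID         = kAudioFormatLinearPCM;
asbd.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
asbd.mBitsPerChannel   = 16;
asbd.mChannelsPerFrame = 2;
asbd.mFramesPerPacket  = 1;                                       // always 1 for LPCM
asbd.mBytesPerFrame    = asbd.mChannelsPerFrame * sizeof(SInt16); // 4 bytes per interleaved frame
asbd.mBytesPerPacket   = asbd.mBytesPerFrame * asbd.mFramesPerPacket;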

Each audio file reader/writer has two formats: one that represents the format the data is stored in (file_format), and one that dictates the format that comes into/out of the reader/writer (client_format). The reader/writers have format converters built in, in case the two formats differ.
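
If you want to inspect a file's stored format before choosing a client format, here's a minimal sketch, assuming reader is an ExtAudioFileRef you've already opened with ExtAudioFileOpenURL:

AudioStreamBasicDescription fileFormat;
UInt32 propSize = sizeof(fileFormat);
// Ask the reader what format the data is stored in on disk
OSStatus err = ExtAudioFileGetProperty(reader, kExtAudioFileProperty_FileDataFormat, &propSize, &fileFormat);
if (err == noErr) {
    printf("file sample rate: %.0f, channels: %u\n", fileFormat.mSampleRate, (unsigned)fileFormat.mChannelsPerFrame);
}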

Here's an example:

-(void)soTest{


    //This is what format the readers will output
    AVAudioFormat *monoClientFormat = [[AVAudioFormat alloc]initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100.0 channels:1 interleaved:NO];

    //This is the format the writer will take as input
    AVAudioFormat *stereoClientFormat = [[AVAudioFormat alloc]initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100.0 channels:2 interleaved:NO];

    //This is the format that will be written to storage.  It must be interleaved.
    AVAudioFormat *stereoFileFormat = [[AVAudioFormat alloc]initWithCommonFormat:AVAudioPCMFormatInt16 sampleRate:44100.0 channels:2 interleaved:YES];




    NSURL *leftURL = [NSBundle.mainBundle URLForResource:@"left" withExtension:@"wav"];
    NSURL *rightURL = [NSBundle.mainBundle URLForResource:@"right" withExtension:@"wav"];

    NSString *stereoPath = [documentsDir() stringByAppendingPathComponent:@"stereo.wav"];
    NSURL *stereoURL = [NSURL fileURLWithPath:stereoPath];

    ExtAudioFileRef leftReader;
    ExtAudioFileRef rightReader;
    ExtAudioFileRef stereoWriter;


    OSStatus status = 0;

    //Create readers and writer
    status = ExtAudioFileOpenURL((__bridge CFURLRef)leftURL, &leftReader);
    if (status) printf("error %i", (int)status); //All the ExtAudioFile functions return a non-zero status if there's an error. I'm only checking this one to demonstrate, but you should check the return of every ExtAudioFile call.
    ExtAudioFileOpenURL((__bridge CFURLRef)rightURL, &rightReader);
    //Here the file format is set to stereo interleaved.
    ExtAudioFileCreateWithURL((__bridge CFURLRef)stereoURL, kAudioFileCAFType, stereoFileFormat.streamDescription, nil, kAudioFileFlags_EraseFile, &stereoWriter);


    //Set client format for readers and writer
    ExtAudioFileSetProperty(leftReader, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), monoClientFormat.streamDescription);
    ExtAudioFileSetProperty(rightReader, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), monoClientFormat.streamDescription);
    ExtAudioFileSetProperty(stereoWriter, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), stereoClientFormat.streamDescription);


    int framesPerRead = 4096;
    int bufferSize = framesPerRead * sizeof(SInt16);

    //Allocate memory for the buffers
    AudioBufferList *leftBuffer = createBufferList(bufferSize,1);
    AudioBufferList *rightBuffer = createBufferList(bufferSize,1);
    AudioBufferList *stereoBuffer = createBufferList(bufferSize,2);

    //ExtAudioFileRead takes an ioNumberFrames argument.  On input it's the number of frames you want; on output it's the number of frames you got.  0 means you're done.
    UInt32 leftFramesIO = framesPerRead;
    UInt32 rightFramesIO = framesPerRead;



    while (leftFramesIO || rightFramesIO) {
        if (leftFramesIO){
            //Read into left buffer
            leftBuffer->mBuffers[0].mDataByteSize = leftFramesIO * sizeof(SInt16);
            ExtAudioFileRead(leftReader, &leftFramesIO, leftBuffer);
            //If the read came up short (near end of file), zero out the stale tail of the buffer
            int framesRemaining = framesPerRead - leftFramesIO;
            if (framesRemaining){
                memset(((SInt16 *)leftBuffer->mBuffers[0].mData) + leftFramesIO, 0, sizeof(SInt16) * framesRemaining);
            }
        }
        else{
            //set to zero if no more frames to read
            memset(leftBuffer->mBuffers[0].mData, 0, sizeof(SInt16) * framesPerRead);
        }

        if (rightFramesIO){
            rightBuffer->mBuffers[0].mDataByteSize = rightFramesIO * sizeof(SInt16);
            ExtAudioFileRead(rightReader, &rightFramesIO, rightBuffer);
            int framesRemaining = framesPerRead - rightFramesIO;
            if (framesRemaining){
                memset(((SInt16 *)rightBuffer->mBuffers[0].mData) + rightFramesIO, 0, sizeof(SInt16) * framesRemaining);
            }
        }
        else{
            memset(rightBuffer->mBuffers[0].mData, 0, sizeof(SInt16) * framesPerRead);
        }


        UInt32 stereoFrames = MAX(leftFramesIO, rightFramesIO);

        //copy left to stereoLeft and right to stereoRight
        memcpy(stereoBuffer->mBuffers[0].mData, leftBuffer->mBuffers[0].mData, sizeof(SInt16) * stereoFrames);
        memcpy(stereoBuffer->mBuffers[1].mData, rightBuffer->mBuffers[0].mData, sizeof(SInt16) * stereoFrames);

        //write to file
        stereoBuffer->mBuffers[0].mDataByteSize = stereoFrames * sizeof(SInt16);
        stereoBuffer->mBuffers[1].mDataByteSize = stereoFrames * sizeof(SInt16);
        ExtAudioFileWrite(stereoWriter, stereoFrames, stereoBuffer);

    }

    ExtAudioFileDispose(leftReader);
    ExtAudioFileDispose(rightReader);
    ExtAudioFileDispose(stereoWriter);

    freeBufferList(leftBuffer);
    freeBufferList(rightBuffer);
    freeBufferList(stereoBuffer);

}

AudioBufferList *createBufferList(int bufferSize, int numberBuffers){
    assert(bufferSize > 0 && numberBuffers > 0);
    //AudioBufferList already contains one AudioBuffer; add room for the rest
    int bufferlistByteSize = sizeof(AudioBufferList);
    bufferlistByteSize += sizeof(AudioBuffer) * (numberBuffers - 1);
    AudioBufferList *bufferList = malloc(bufferlistByteSize);
    bufferList->mNumberBuffers = numberBuffers;
    for (int i = 0; i < numberBuffers; i++) {
        bufferList->mBuffers[i].mNumberChannels = 1;
        bufferList->mBuffers[i].mDataByteSize = bufferSize;
        bufferList->mBuffers[i].mData = malloc(bufferSize);
    }
    return bufferList;
}
void freeBufferList(AudioBufferList *bufferList){
    for (int i = 0; i < bufferList->mNumberBuffers; i++) {
        free(bufferList->mBuffers[i].mData);
    }
    free(bufferList);
}
NSString *documentsDir(){
    static NSString *path = NULL;
    if(!path){
        path = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, 1).firstObject;
    }
    return path;
}
dave234
  • I am getting back a stereo file with no audio in either channel. The input mono files are of CAF type, but I wouldn't expect the formatting to deviate much. – A21 Feb 16 '17 at 16:18
  • Are you checking all of the ExtAudioFile return values? – dave234 Feb 16 '17 at 17:40
  • Yup, noticed the issue is with the EAF output file creation. The URL I am passing in has the extension ".caf" compared to your ".wav". It gives me an OSStatus error of 1718449215, which refers to kAudioFormatUnsupportedDataFormatError. – A21 Feb 16 '17 at 18:20
  • Changing it to kAudioFormatLinearPCM also didn't work, even though that's the output format I specified before when I was able to produce the interleaved stereo file from the interleaved mono file. – A21 Feb 16 '17 at 18:25
  • It should work for both caf and wav. Make sure you're passing the interleaved format (stereoFileFormat in the example) to ExtAudioFileCreateWithURL. It will fail with a non-interleaved format. – dave234 Feb 16 '17 at 18:35
  • Yup, I had mistakenly changed format for one of the AVAudioFormats. I only receive an error now when I'm writing to stereoWriter at the end. – A21 Feb 16 '17 at 18:44
  • Just keep checking your errors and trying stuff until it works. The example is solid. – dave234 Feb 16 '17 at 18:45