I'm trying to develop a VoIP application:

  • the audio buffer fetched from the recording callback is wrapped in an NSData and then sent to the remote side via GCDAsyncSocket (see the sketch after this list),

  • and the remote side receives the NSData, unwraps it back into an audio
    buffer, and the playback callback then fetches that buffer.
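
A minimal sketch of that send path (the `bufferList` and `self.socket` names are my own placeholders, not the project's actual identifiers):

    // Inside the recording callback, after AudioUnitRender has filled
    // the buffer list: copy the samples into an NSData and queue a write.
    AudioBuffer buffer = bufferList->mBuffers[0];
    NSData *packet = [NSData dataWithBytes:buffer.mData
                                    length:buffer.mDataByteSize];
    [self.socket writeData:packet withTimeout:-1 tag:0]; // GCDAsyncSocket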

My plan works so far: it runs fine locally (the socket sends the data to localhost, and the buffer plays back locally).

But when it runs on two devices (one real iPhone 4S, one simulator), the voice becomes strange and sounds robotic.

Is there any way to avoid this robotic-sound effect?

Here are my AudioUnit settings:

#pragma mark - Init Methods

- (void)initAudioUnit
{
    OSStatus status;

    // Describe audio component
    AudioComponentDescription desc;
    desc.componentType = kAudioUnitType_Output;
    desc.componentSubType = kAudioUnitSubType_RemoteIO;
    desc.componentFlags = 0;
    desc.componentFlagsMask = 0;
    desc.componentManufacturer = kAudioUnitManufacturer_Apple;

    // Get component
    AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

    // Get audio units
    status = AudioComponentInstanceNew(inputComponent, &audioUnit);
    checkStatus(status);

    // Enable IO for recording
    UInt32 flag = 1;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  kInputBus,
                                  &flag,
                                  sizeof(flag));
    checkStatus(status);

    // Enable IO for playback
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Output,
                                  kOutputBus,
                                  &flag,
                                  sizeof(flag));
    checkStatus(status);

    // Describe format
    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate = 44100.0f; // FS
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    audioFormat.mChannelsPerFrame = 1; // mono
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mBitsPerChannel = sizeof(short) * 8; // 16-bit
    audioFormat.mBytesPerFrame = audioFormat.mBitsPerChannel / 8 * audioFormat.mChannelsPerFrame;
    audioFormat.mBytesPerPacket = audioFormat.mBytesPerFrame * audioFormat.mFramesPerPacket;

    // Apply format
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));
    checkStatus(status);
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Input,
                                  kOutputBus,
                                  &audioFormat,
                                  sizeof(audioFormat));
    checkStatus(status);


    // Set input callback
    AURenderCallbackStruct callbackStruct;
    callbackStruct.inputProc = recordingCallback;
    callbackStruct.inputProcRefCon = (__bridge void*)self;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_SetInputCallback,
                                  kAudioUnitScope_Global,
                                  kInputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));
    checkStatus(status);


    // Set output callback
    callbackStruct.inputProc = playbackCallback;
    callbackStruct.inputProcRefCon = (__bridge void*)self;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_SetRenderCallback,
                                  kAudioUnitScope_Global,
                                  kOutputBus,
                                  &callbackStruct,
                                  sizeof(callbackStruct));
    checkStatus(status);


    /*
    // Disable buffer allocation for the recorder (optional - do this if we want to pass in our own)
    flag = 0;
    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_ShouldAllocateBuffer,
                                  kAudioUnitScope_Output,
                                  kInputBus,
                                  &flag,
                                  sizeof(flag));

    // Allocate our own buffers (1 channel, 16 bits per sample, thus 16 bits per frame, thus 2 bytes per frame).
    // In practice the buffers contain 512 frames; if this changes it is handled in processAudio.
    tempBuffer.mNumberChannels = 1;
    tempBuffer.mDataByteSize = 512 * 2;
    tempBuffer.mData = malloc( 512 * 2 );
    checkStatus(status);
    */

    // Initialise
    status = AudioUnitInitialize(audioUnit);
    checkStatus(status);

    conversionBuffer = (SInt16 *) malloc(1024 * sizeof(SInt16));
}

BTW, is there any way to set audioFormat.mFramesPerPacket > 1?

In my case, it prints an error if that parameter is > 1.

I was thinking about sending a buffer that contains multiple frames (to give the remote side more time to play), which should be better for VoIP than sending one frame per packet, right? A batching sketch is below.
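
A hypothetical batching sketch (the `pending` NSMutableData and the 2048-byte threshold are illustrative assumptions, not part of the project):

    // Accumulate recorded bytes until one network packet's worth is ready,
    // instead of writing every tiny callback buffer to the socket.
    [pending appendBytes:buffer.mData length:buffer.mDataByteSize];
    if (pending.length >= 2048) { // ~1024 mono 16-bit frames
        [self.socket writeData:[pending copy] withTimeout:-1 tag:0];
        pending.length = 0; // reset for the next batch
    }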


2 Answers


Since the audio sample rate clocks of the two devices won't be perfectly synchronized, you will have to handle buffer underflow and overflow due to slight sample rate mismatches, as well as network latency jitter.

Also note that the buffer size sent to the RemoteIO callback may not stay constant, so the two callbacks will have to be able to handle buffer size mismatches.
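
For example, a FIFO between the socket and the playback callback is one common way to absorb both effects. A minimal sketch, assuming a hypothetical thread-safe `fifoRead` helper that the network code feeds:

    static OSStatus playbackCallback(void *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp *inTimeStamp,
                                     UInt32 inBusNumber,
                                     UInt32 inNumberFrames,
                                     AudioBufferList *ioData)
    {
        AudioBuffer *buf = &ioData->mBuffers[0];
        // Pull however many bytes the network has delivered so far.
        UInt32 got = fifoRead(buf->mData, buf->mDataByteSize);
        if (got < buf->mDataByteSize) {
            // Underflow: pad with silence rather than replaying stale audio.
            memset((char *)buf->mData + got, 0, buf->mDataByteSize - got);
        }
        return noErr;
    }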


I just resolved this problem!

You need to set up the audio session property and make sure the two devices use the same buffer duration:

    // set preferred buffer size; pick one duration and use it on both
    // devices (0.02 s here is an example value, not the required one)
    Float32 preferredBufferDuration = 0.02f;
    UInt32 size = sizeof(preferredBufferDuration);
    result = AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration,
                                     size, &preferredBufferDuration);
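
The hardware may not grant exactly the preferred value, so it can also help to read back what was actually set (a sketch using the matching get call):

    // Check the duration the hardware actually granted; it can differ
    // from the preferred value, so don't assume a fixed callback size.
    Float32 actualDuration = 0;
    UInt32 propSize = sizeof(actualDuration);
    AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                            &propSize, &actualDuration);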