
I want to enable the echo cancellation (voice processing) feature in my iOS audio pipeline. From what I have read, I have to use the kAudioUnitSubType_VoiceProcessingIO subtype.

My VoIP app uses two AudioUnits: one for the mic side and another for the speaker side. So for a full-duplex audio call I currently use two separate audio units (I'm not sure whether that is even allowed for voice processing on iOS; see the single-unit sketch below).
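For comparison, the single-unit layout that kAudioUnitSubType_VoiceProcessingIO is built around looks roughly like the sketch below: one AUAudioUnit whose bus 0 drives the speaker and whose bus 1 delivers the (already echo-cancelled) mic samples, so the processor sees both streams. This is a hedged sketch, not my actual code; the silence-fill and the local render buffer stand in for the real speaker/mic plumbing.

import AVFoundation
import AudioToolbox

// Sketch: a single VoiceProcessingIO unit serving both directions.
let desc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_VoiceProcessingIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0, componentFlagsMask: 0)

do {
    let unit = try AUAudioUnit(componentDescription: desc)
    unit.isInputEnabled = true    // bus 1: microphone
    unit.isOutputEnabled = true   // bus 0: speaker

    // Speaker direction: fill the buffers with far-end audio
    // (silence here as a placeholder).
    unit.outputProvider = { _, _, _, _, bufferList -> AUAudioUnitStatus in
        for buffer in UnsafeMutableAudioBufferListPointer(bufferList) {
            memset(buffer.mData, 0, Int(buffer.mDataByteSize))
        }
        return noErr
    }

    // Mic direction: pull the echo-cancelled frames via renderBlock.
    let renderBlock = unit.renderBlock
    unit.inputHandler = { flags, timestamp, frameCount, bus in
        var bufferList = AudioBufferList(
            mNumberBuffers: 1,
            mBuffers: AudioBuffer(mNumberChannels: 1,
                                  mDataByteSize: frameCount * 2, // Int16 mono
                                  mData: nil))    // let the unit supply memory
        _ = renderBlock(flags, timestamp, frameCount, bus, &bufferList, nil)
        // A real app would hand the rendered data to its encoder here.
    }

    try unit.allocateRenderResources()
    try unit.startHardware()
} catch {
    print("VoiceProcessingIO setup failed: \(error)")
}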

After I set this subtype on my AudioUnits, the echo cancellation seems to work, but the audio quality is not very good. It's difficult to describe, but I get some background noise in the signal.

What do I have to do to optimize this and remove the noise from my signal? Here is my setup code for the audio engines. I won't post all of the code, because it's a lot; instead, here are only the pieces I think are relevant. If something is missing, please let me know.

Audio session (audio format: PCM Int16, sample rate: 16000 Hz, 1 channel):

do {
    let session = AVAudioSession.sharedInstance()
    try session.setPreferredSampleRate(16000)
    try session.setPreferredIOBufferDuration(0.02)

    try session.setCategory(.playAndRecord)
    try session.setActive(true)
} catch let error {
    Logger.log("Error while setting up AVAudioSession: \(error)", type: .error)
}
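One session-level detail that may matter here (an assumption on my side, not part of my current code): Apple's voice-processing path is usually paired with the .voiceChat mode, which tunes the audio route and processing for two-way voice. A minimal sketch of the same setup with the mode added:

import AVFoundation

// Sketch: same session setup, with the .voiceChat mode added.
do {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [])
    try session.setPreferredSampleRate(16000)
    try session.setPreferredIOBufferDuration(0.02)
    try session.setActive(true)
} catch {
    print("AVAudioSession setup failed: \(error)")
}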

First, the recorder (AudioUnit) side:

var componentDesc = AudioComponentDescription(
    componentType: kAudioUnitType_Output,
    componentSubType: kAudioUnitSubType_VoiceProcessingIO,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0)

var streamFormatDesc = AudioStreamBasicDescription(
    mSampleRate: 16000,
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsNonInterleaved,
    mBytesPerPacket: 2,
    mFramesPerPacket: 1,
    mBytesPerFrame: 2,
    mChannelsPerFrame: 1,
    mBitsPerChannel: 16,
    mReserved: 0)
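I haven't shown how these two descriptions get applied to the recorder unit, so for completeness here is a sketch of the usual C-API wiring (an assumption about the setup, with error checks omitted): enable input on bus 1 and set the ASBD on the output scope of that bus, which is where the app reads mic data.

import AudioToolbox

// Sketch (assumed wiring; error handling omitted for brevity).
var audioUnit: AudioUnit?
let component = AudioComponentFindNext(nil, &componentDesc)!
AudioComponentInstanceNew(component, &audioUnit)

// Enable input on bus 1 (the microphone element of an I/O unit).
var enableInput: UInt32 = 1
AudioUnitSetProperty(audioUnit!,
                     kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input,
                     1,
                     &enableInput,
                     UInt32(MemoryLayout<UInt32>.size))

// The app reads mic samples in this format from bus 1's output scope.
AudioUnitSetProperty(audioUnit!,
                     kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Output,
                     1,
                     &streamFormatDesc,
                     UInt32(MemoryLayout<AudioStreamBasicDescription>.size))

AudioUnitInitialize(audioUnit!)
AudioOutputUnitStart(audioUnit!)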

Here is the playback engine (AUAudioUnit):

do {
    let audioComponentDescription = AudioComponentDescription(
        componentType: kAudioUnitType_Output,
        componentSubType: kAudioUnitSubType_VoiceProcessingIO,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)

    if auAudioUnit == nil {
        auAudioUnit = try AUAudioUnit(componentDescription: audioComponentDescription)

        // setFormat expects an AVAudioFormat, not a bare sample rate.
        if let format = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                      sampleRate: 16000,
                                      channels: 1,
                                      interleaved: false) {
            try auAudioUnit.inputBusses[0].setFormat(format)
        }

        auAudioUnit.outputProvider = { (_, _, frameCount, _, inputDataList) -> AUAudioUnitStatus in
            self.fillSpeakerBuffer(inputDataList: inputDataList, frameCount: Int(frameCount))
            return 0
        }
    }
    auAudioUnit.isOutputEnabled = true

    try auAudioUnit.allocateRenderResources()
    try auAudioUnit.startHardware()
} catch let error {
    Logger.log("Error while setting up the playback AUAudioUnit: \(error)", type: .error)
}
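Regarding the background noise: it may be worth ruling out the voice processor's own stages (AGC, noise suppression) before looking elsewhere. With the C API, the voice-I/O properties can be flipped for an A/B comparison. A hedged sketch follows, assuming the recorder side's C-style audioUnit handle; the scope/element conventions are my reading of the AUVoiceIO property headers.

import AudioToolbox

// Sketch: A/B-testing the voice-processing stages (assumes the C-style
// `audioUnit` handle from the recorder setup above).

// Turning the automatic gain control off can reveal whether AGC is
// pumping the noise floor up between words.
var agc: UInt32 = 0
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_VoiceProcessingEnableAGC,
                     kAudioUnitScope_Global,
                     0,
                     &agc,
                     UInt32(MemoryLayout<UInt32>.size))

// Bypassing voice processing entirely makes the unit behave like plain
// RemoteIO; if the noise disappears, the processing stage is the cause,
// and if it stays, the format/route setup is the more likely suspect.
var bypass: UInt32 = 1
AudioUnitSetProperty(audioUnit!,
                     kAUVoiceIOProperty_BypassVoiceProcessing,
                     kAudioUnitScope_Global,
                     0,
                     &bypass,
                     UInt32(MemoryLayout<UInt32>.size))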