I just started learning Swift, and recently I have been trying to build a module (library) that does the following on iOS/macOS:
- It reads the microphone and saves the acoustic signal to a buffer.
- In parallel, it initiates an HTTP POST request, connects to a remote HTTP server, and streams the contents of the buffer from the previous step to the server.
- The module also listens for the server's response and does the final processing.
I have looked online and found some resources. For 1, the top answer of this post (How to get real-time microphone input in macOS?) serves my need. For 2 and 3, according to this page (https://developer.apple.com/documentation/foundation/url_loading_system/uploading_streams_of_data), it looks like a combination of URLSessionUploadTask, URLSessionTaskDelegate, URLSessionDataDelegate, and StreamDelegate is the right way to go. So, in order to prototype quickly, I constructed the following code in an Xcode playground (macOS):
First I made a simple Queue for holding the Data chunks generated from the microphone, which I treat as a buffer.
struct Queue<T> {
    var items = [T]()

    mutating func enQueue(item: T) {
        items.append(item)
    }

    // Return nil instead of crashing on an empty queue
    // (removeFirst() traps when the array is empty).
    mutating func deQueue() -> T? {
        return items.isEmpty ? nil : items.removeFirst()
    }

    func isEmpty() -> Bool {
        return items.isEmpty
    }

    func peek() -> T? {
        return items.first
    }
}
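One caveat I noticed while writing this: the queue gets mutated from two threads at once (the audio tap enqueues while the writer loop dequeues), and a plain Swift array is not thread-safe. A minimal sketch of a synchronized variant, where the SyncQueue name and the queue label are just placeholders of mine:

import Foundation

final class SyncQueue<T> {
    private var items = [T]()
    // A private serial queue serializes all access to `items`.
    private let lock = DispatchQueue(label: "com.test.syncQueue")

    func enQueue(_ item: T) {
        lock.sync { items.append(item) }
    }

    func deQueue() -> T? {
        lock.sync { items.isEmpty ? nil : items.removeFirst() }
    }

    func isEmpty() -> Bool {
        lock.sync { items.isEmpty }
    }
}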
Then, I create an AudioInput class, in which there are two main functions:
- startWriting(outputStream:), which will be called once we have an OutputStream ready from our networking class.
- setupAudio(), in which we set up the microphone input and store the incoming data into the buffer.
import AVFoundation

class AudioInput {
    let audioEngine = AVAudioEngine()
    var queue: Queue<Data> = Queue<Data>()
    var shouldWrite = true
    // Note: DispatchQueue(label:) creates a serial queue despite the name.
    let concurrentQueue = DispatchQueue(label: "com.test.currentQueue")

    init() {
        self.setupAudio()
    }

    func startWriting(outputStream: OutputStream) {
        concurrentQueue.async {
            // Busy-waits for now; fine for a quick prototype.
            while self.shouldWrite {
                if !self.queue.isEmpty(), let chunk = self.queue.deQueue() {
                    // Write exactly the number of bytes this chunk contains,
                    // not a hard-coded 4800.
                    _ = chunk.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
                        guard let base = raw.bindMemory(to: UInt8.self).baseAddress else { return 0 }
                        return outputStream.write(base, maxLength: chunk.count)
                    }
                }
            }
        }
    }

    func setupAudio() {
        let inputNode = audioEngine.inputNode
        let srate = inputNode.inputFormat(forBus: 0).sampleRate
        if srate == 0 {
            exit(0)
        }
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0,
                             bufferSize: 1024,
                             format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            let channels = UnsafeBufferPointer(start: buffer.floatChannelData,
                                               count: Int(buffer.format.channelCount))
            // Copy out only the frames actually filled (frameLength),
            // not the buffer's full capacity (frameCapacity).
            let byteCount = Int(buffer.frameLength) * Int(buffer.format.streamDescription.pointee.mBytesPerFrame)
            let data = Data(bytes: channels[0], count: byteCount)
            self.queue.enQueue(item: data)
        }
        try! audioEngine.start()
    }

    func stopAudio() {
        shouldWrite = false  // terminate the writer loop as well
        audioEngine.stop()
    }
}
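A platform note, since I want this to run on iOS too: as far as I understand, on iOS the engine only receives microphone input after the shared AVAudioSession is configured for recording (macOS has no AVAudioSession), roughly like this (the configureAudioSession name is just for illustration):

#if os(iOS)
import AVFoundation

func configureAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    // Record-only category; activate before starting AVAudioEngine.
    try session.setCategory(.record, mode: .default, options: [])
    try session.setActive(true)
    // iOS also requires NSMicrophoneUsageDescription in Info.plist, plus the
    // runtime prompt via session.requestRecordPermission { granted in ... }.
}
#endif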
Finally, I made the CoreModule class, which conforms to URLSessionTaskDelegate, URLSessionDataDelegate, and StreamDelegate. The most important function is setupOutputStream(), in which we create a URL upload task and pass self in as the task delegate. We also have three different urlSession() methods that handle the different 'callback events', like the output-stream-ready event and the server-response-received event.
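(The snippet below refers to a Streams type I haven't shown; it is just a small struct pairing the two bound streams, as in Apple's sample:)

struct Streams {
    let input: InputStream
    let output: OutputStream
}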
class CoreModule: NSObject, URLSessionTaskDelegate, URLSessionDataDelegate, StreamDelegate {
    var currentAudioInput: AudioInput? = nil

    lazy var boundStreams: Streams = {
        var inputOrNil: InputStream? = nil
        var outputOrNil: OutputStream? = nil
        Stream.getBoundStreams(withBufferSize: 4096,
                               inputStream: &inputOrNil,
                               outputStream: &outputOrNil)
        guard let input = inputOrNil, let output = outputOrNil else {
            fatalError("On return of `getBoundStreams`, both `inputStream` and `outputStream` will contain non-nil streams.")
        }
        // Configure and open the output stream.
        output.delegate = self
        output.schedule(in: .current, forMode: .default)
        output.open()
        return Streams(input: input, output: output)
    }()

    lazy var session: URLSession = URLSession(configuration: .default,
                                              delegate: self,
                                              delegateQueue: .main)

    func setupOutputStream(audioInput: AudioInput, url: String) {
        currentAudioInput = audioInput
        let url = URL(string: url)!
        var request = URLRequest(url: url,
                                 cachePolicy: .reloadIgnoringLocalCacheData,
                                 timeoutInterval: 10)
        // Add required headers.
        request.addValue("test", forHTTPHeaderField: "test")
        print(request.allHTTPHeaderFields ?? [:])
        request.httpMethod = "POST"
        let uploadTask = session.uploadTask(withStreamedRequest: request)
        uploadTask.resume()
    }

    func deactivate() {
        boundStreams.output.close()
        boundStreams.input.close()
    }

    // The session asks for the request body stream; hand it the input side
    // of our bound stream pair.
    func urlSession(_ session: URLSession, task: URLSessionTask,
                    needNewBodyStream completionHandler: @escaping (InputStream?) -> Void) {
        completionHandler(boundStreams.input)
    }

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        print(String(data: data, encoding: .utf8) ?? "<non-UTF-8 response>")
    }

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask,
                    didReceive response: URLResponse,
                    completionHandler: @escaping (URLSession.ResponseDisposition) -> Void) {
        print("did receive response")
        completionHandler(.allow)
    }

    func stream(_ aStream: Stream, handle eventCode: Stream.Event) {
        guard aStream == boundStreams.output else {
            return
        }
        if eventCode.contains(.hasSpaceAvailable) {
            print("start streaming!")
            if let audioInput = currentAudioInput {
                audioInput.startWriting(outputStream: boundStreams.output)
            } else {
                print("input null!")
            }
        }
        if eventCode.contains(.errorOccurred) {
            print("error!")
        }
    }
}
Finally, I wire everything up as follows:
var testAudioInput = AudioInput()
var coreModule = CoreModule()
coreModule.setupOutputStream(audioInput: testAudioInput, url: "https://test.url.com")
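(For reference: since a playground normally terminates once top-level code finishes, I believe indefinite execution also has to be enabled for the async audio and networking callbacks to fire at all:)

import PlaygroundSupport
PlaygroundPage.current.needsIndefiniteExecution = true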
Then I ran the playground file. I observed the following from the output and am quite confused:
- "start streaming!" is printed multiple times. Does it mean that the outputstream will be on and off during this http request, so I cannot start a thread in this function and indefinitely write to the Outputstream? If this is the case, how do I handle the logic here?
- From the HTTP server, I got a timeout response even though the output stream was still open. How can this happen?
A more general question: is my approach the correct one? I would like this module to be able to run on both iOS and macOS. Thanks in advance for your help!