I just started learning Swift, and recently I have been trying to build a module (library) that does the following on iOS/macOS:
- It reads the microphone and saves the acoustic signal to a buffer.
- In parallel, it initiates an HTTP POST request, connects to a remote HTTP server, and streams the contents of the buffer from the previous step to the server.
- The module also listens for the server's response and does the final processing.
I have looked online and found some resources. For 1, the top answer of this post (How to get real-time microphone input in macOS?) serves my need. For 2 and 3, according to this page (https://developer.apple.com/documentation/foundation/url_loading_system/uploading_streams_of_data), it looks like a combination of URLSessionUploadTask, URLSessionTaskDelegate, URLSessionDataDelegate, and StreamDelegate is the right way to go. So, in order to prototype quickly, I constructed the following code in an Xcode playground (macOS):
First I made a simple Queue for holding the Data chunks generated from the microphone, which I treat as a buffer.
struct Queue<T> {
    var items = [T]()

    mutating func enQueue(item: T) {
        items.append(item)
    }

    // Return nil instead of crashing on an empty queue
    // (removeFirst() traps when the array is empty).
    mutating func deQueue() -> T? {
        return items.isEmpty ? nil : items.removeFirst()
    }

    func isEmpty() -> Bool {
        return items.isEmpty
    }

    func peek() -> T? {
        return items.first
    }
}
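One caveat I noticed while writing this: the queue gets mutated from two threads at once (the audio tap enqueues while the writer loop dequeues), and a plain Swift array is not thread-safe. A minimal sketch of a synchronized variant, where the SyncQueue name and the queue label are just placeholders of mine:

import Foundation

final class SyncQueue<T> {
    private var items = [T]()
    // A private serial queue serializes all access to `items`.
    private let lock = DispatchQueue(label: "com.test.syncQueue")

    func enQueue(_ item: T) {
        lock.sync { items.append(item) }
    }

    func deQueue() -> T? {
        lock.sync { items.isEmpty ? nil : items.removeFirst() }
    }

    func isEmpty() -> Bool {
        lock.sync { items.isEmpty }
    }
}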
Then, I create an AudioInput class, in which there are two main functions:
- startWriting(outputStream:), which will be called once we have an OutputStream ready from our networking class.
- setupAudio(), in which we set up the microphone input and store the incoming data into the buffer.
import AVFoundation

class AudioInput {
    let audioEngine = AVAudioEngine()
    var queue: Queue<Data> = Queue<Data>()
    var shouldWrite = true
    // Note: DispatchQueue(label:) creates a serial queue despite the name.
    let concurrentQueue = DispatchQueue(label: "com.test.currentQueue")

    init() {
        self.setupAudio()
    }

    func startWriting(outputStream: OutputStream) {
        concurrentQueue.async {
            // Busy-waits for now; fine for a quick prototype.
            while self.shouldWrite {
                if !self.queue.isEmpty(), let chunk = self.queue.deQueue() {
                    // Write exactly the number of bytes this chunk contains,
                    // not a hard-coded 4800.
                    _ = chunk.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
                        guard let base = raw.bindMemory(to: UInt8.self).baseAddress else { return 0 }
                        return outputStream.write(base, maxLength: chunk.count)
                    }
                }
            }
        }
    }

    func setupAudio() {
        let inputNode = audioEngine.inputNode
        let srate = inputNode.inputFormat(forBus: 0).sampleRate
        if srate == 0 {
            exit(0)
        }
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0,
                             bufferSize: 1024,
                             format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            let channels = UnsafeBufferPointer(start: buffer.floatChannelData,
                                               count: Int(buffer.format.channelCount))
            // Copy out only the frames actually filled (frameLength),
            // not the buffer's full capacity (frameCapacity).
            let byteCount = Int(buffer.frameLength) * Int(buffer.format.streamDescription.pointee.mBytesPerFrame)
            let data = Data(bytes: channels[0], count: byteCount)
            self.queue.enQueue(item: data)
        }
        try! audioEngine.start()
    }

    func stopAudio() {
        shouldWrite = false  // terminate the writer loop as well
        audioEngine.stop()
    }
}
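A platform note, since I want this to run on iOS too: as far as I understand, on iOS the engine only receives microphone input after the shared AVAudioSession is configured for recording (macOS has no AVAudioSession), roughly like this (the configureAudioSession name is just for illustration):

#if os(iOS)
import AVFoundation

func configureAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    // Record-only category; activate before starting AVAudioEngine.
    try session.setCategory(.record, mode: .default, options: [])
    try session.setActive(true)
    // iOS also requires NSMicrophoneUsageDescription in Info.plist, plus the
    // runtime prompt via session.requestRecordPermission { granted in ... }.
}
#endif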
Finally, I made the CoreModule class, which conforms to URLSessionTaskDelegate, URLSessionDataDelegate, and StreamDelegate. The most important function is setupOutputStream(), in which we create a URL upload task and pass self in as the task delegate. We also have three different urlSession() methods that handle the different 'callback events', like the output-stream-ready event and the server-response-received event.
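(The snippet below refers to a Streams type I haven't shown; it is just a small struct pairing the two bound streams, as in Apple's sample:)

struct Streams {
    let input: InputStream
    let output: OutputStream
}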
class CoreModule: NSObject, URLSessionTaskDelegate, URLSessionDataDelegate, StreamDelegate {
    var currentAudioInput: AudioInput? = nil

    lazy var boundStreams: Streams = {
        var inputOrNil: InputStream? = nil
        var outputOrNil: OutputStream? = nil
        Stream.getBoundStreams(withBufferSize: 4096,
                               inputStream: &inputOrNil,
                               outputStream: &outputOrNil)
        guard let input = inputOrNil, let output = outputOrNil else {
            fatalError("On return of `getBoundStreams`, both `inputStream` and `outputStream` will contain non-nil streams.")
        }
        // Configure and open the output stream.
        output.delegate = self
        output.schedule(in: .current, forMode: .default)
        output.open()
        return Streams(input: input, output: output)
    }()

    lazy var session: URLSession = URLSession(configuration: .default,
                                              delegate: self,
                                              delegateQueue: .main)

    func setupOutputStream(audioInput: AudioInput, url: String) {
        currentAudioInput = audioInput
        let url = URL(string: url)!
        var request = URLRequest(url: url,
                                 cachePolicy: .reloadIgnoringLocalCacheData,
                                 timeoutInterval: 10)
        // Add required headers.
        request.addValue("test", forHTTPHeaderField: "test")
        print(request.allHTTPHeaderFields ?? [:])
        request.httpMethod = "POST"
        let uploadTask = session.uploadTask(withStreamedRequest: request)
        uploadTask.resume()
    }

    func deactivate() {
        boundStreams.output.close()
        boundStreams.input.close()
    }

    // The session asks for the request body stream; hand it the input side
    // of our bound stream pair.
    func urlSession(_ session: URLSession, task: URLSessionTask,
                    needNewBodyStream completionHandler: @escaping (InputStream?) -> Void) {
        completionHandler(boundStreams.input)
    }

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        print(String(data: data, encoding: .utf8) ?? "<non-UTF-8 response>")
    }

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask,
                    didReceive response: URLResponse,
                    completionHandler: @escaping (URLSession.ResponseDisposition) -> Void) {
        print("did receive response")
        completionHandler(.allow)
    }

    func stream(_ aStream: Stream, handle eventCode: Stream.Event) {
        guard aStream == boundStreams.output else {
            return
        }
        if eventCode.contains(.hasSpaceAvailable) {
            print("start streaming!")
            if let audioInput = currentAudioInput {
                audioInput.startWriting(outputStream: boundStreams.output)
            } else {
                print("input null!")
            }
        }
        if eventCode.contains(.errorOccurred) {
            print("error!")
        }
    }
}
Finally, I wire everything up as follows:
var testAudioInput = AudioInput()
var coreModule = CoreModule()
coreModule.setupOutputStream(audioInput: testAudioInput, url: "https://test.url.com")
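(For reference: since a playground normally terminates once top-level code finishes, I believe indefinite execution also has to be enabled for the async audio and networking callbacks to fire at all:)

import PlaygroundSupport
PlaygroundPage.current.needsIndefiniteExecution = true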
Then I ran the playground file. I observed the following from the output and am quite confused:
- "start streaming!" is printed multiple times. Does it mean that the outputstream will be on and off during this http request, so I cannot start a thread in this function and indefinitely write to the Outputstream? If this is the case, how do I handle the logic here?
- From the HTTP server, I got a timeout response even though the output stream was still open. How can this happen?
A more general question: is my approach the correct one? I would like this module to be able to run on both iOS and macOS. Thanks in advance for your help!