
Over the past few years I have steadily developed a complete WebRTC-based browser phone using the SIP protocol. The main SIP toolbox is SIPJS (https://sipjs.com/), and it provides all the tools one needs to make and receive calls to a SIP-based PBX of your own.

The Browser Phone project (https://github.com/InnovateAsterisk/Browser-Phone/) gives SIPJS its full functionality and UI. You can simply navigate to the phone in a browser and start using it. Everything works perfectly.

On Mobile

Apple finally allows WebRTC (getUserMedia()) in WKWebView, so it wasn't long before people started to ask how it would work on mobile. And while the UI is well suited to cellphones and tablets, the UI alone isn't enough nowadays to be a full solution.

The main consideration is that a mobile app typically has a short lifespan: you can't, or don't, leave it running in the background the way you would leave the browser running on a PC. This presents a few challenges to making the Browser Phone truly mobile friendly. iOS is going to want to shut down the app as soon as it's not the frontmost app - and rightly so. There are tools for handling that, like CallKit & Push Notifications, which allow the app to be woken up so that it can accept the call and notify the user (rough sketch below).
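For context, that wake-up path looks roughly like this - a minimal sketch, not my exact code: a VoIP push arrives via PushKit and has to be reported to CallKit straight away so the user can be shown the incoming call.

import PushKit
import CallKit

// Rough sketch of the PushKit/CallKit wake-up path (simplified, not my exact code).
class VoIPPushHandler: NSObject, PKPushRegistryDelegate {
    private let registry = PKPushRegistry(queue: .main)
    private let provider = CXProvider(configuration: CXProviderConfiguration())

    func start() {
        registry.delegate = self
        registry.desiredPushTypes = [.voIP]
    }

    func pushRegistry(_ registry: PKPushRegistry, didUpdate pushCredentials: PKPushCredentials, for type: PKPushType) {
        // Send pushCredentials.token to the server that triggers the VoIP push.
    }

    func pushRegistry(_ registry: PKPushRegistry, didReceiveIncomingPushWith payload: PKPushPayload, for type: PKPushType, completion: @escaping () -> Void) {
        // iOS 13+ requires reporting an incoming call to CallKit before this returns.
        let update = CXCallUpdate()
        update.remoteHandle = CXHandle(type: .generic, value: "Incoming SIP call")
        provider.reportNewIncomingCall(with: UUID(), update: update) { _ in completion() }
        // The app then has a short window to let the webview REGISTER and take the call.
    }
}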

Just remember, this app is created by opening a UIViewController, adding a WKWebView, and navigating to the phone page. There is full communication between the app and the HTML & JavaScript, so events can be passed back and forth (see the example below).
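For example, pushing an event from the native side into the page looks something like this (webView here is the WKWebView added above, and the JS function name is just a placeholder, not necessarily what the phone page actually exposes):

// Swift -> page: call a JS function exposed by the phone page.
// (The function name is a placeholder; the real page may expose something different.)
webView.evaluateJavaScript("window.AnswerCall && window.AnswerCall(1);") { _, error in
    if let error = error { print("JS bridge call failed: \(error)") }
}

// Page -> Swift: the page posts events to a registered WKScriptMessageHandler, e.g.
//   window.webkit.messageHandlers.phoneEvent.postMessage({ event: "incomingCall" });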

WKWebView & AVAudioSession Issue:

After a LOT of reading through unsolved forum posts, it's clear that AVAudioSession.sharedInstance() is simply not connected to the WKWebView, or the connection is undocumented.

The result is that if the call starts from the app and the app is sent to the background, the microphone is disabled. Clearly this isn't an option if you are on a call. I can manage this limitation a little by putting the call on hold when the app is sent to the background - although this would be confusing to the user and a poor user experience.
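For reference, this is roughly the session setup a native VoIP app would use. Activating it from the app doesn't appear to have any effect on the capture that WKWebView's getUserMedia() holds, which is exactly the problem:

import AVFoundation

// Roughly the session setup a native VoIP app would use. Activating it from the app
// does not appear to affect the capture held by WKWebView's getUserMedia().
let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(.playAndRecord,
                            mode: .voiceChat,
                            options: [.allowBluetooth, .allowBluetoothA2DP])
    try session.setActive(true)
} catch {
    print("AVAudioSession setup failed: \(error)")
}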

However, the real issue is that if the app was woken by CallKit, the microphone isn't activated in the first place, because the app never goes to the foreground (CallKit is in front), and even if you do switch to the app afterwards, it still doesn't activate. This is simply an unacceptable user experience.

What I found interesting is that if you simply open Safari on iOS (15.x) and navigate to the phone page, https://www.innovateasterisk.com/phone/ (without making an app in Xcode and loading it into a WKWebView), the microphone continues to work when the app is sent to the background. So how does Safari manage to do this? Of course this doesn't and can't solve the CallKit issue, but it's still interesting to see that Safari can make use of the microphone in the background, since Safari is built on WKWebView.

(I was reading about entitlements, and that this may have to be specially granted... I'm not sure how that works?)

The next problem with AVAudioSession is that since you cannot access the session for the WKWebView, you cannot change the output of the <audio> element, so you cannot change it from, say, speaker to earpiece, or make it use a Bluetooth device.
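Natively this would just be a couple of AVAudioSession calls (sketch below), but since they only touch the app's own session they don't appear to give any of that control over the webview's audio:

import AVFoundation

// What switching the output would look like natively - these calls don't appear
// to reach the audio that the <audio> element inside the WKWebView is playing.
let session = AVAudioSession.sharedInstance()

// Loudspeaker:
try? session.overrideOutputAudioPort(.speaker)

// Back to the default receiver (earpiece):
try? session.overrideOutputAudioPort(.none)

// Prefer a Bluetooth hands-free device if one is connected:
if let bluetooth = session.availableInputs?.first(where: { $0.portType == .bluetoothHFP }) {
    try? session.setPreferredInput(bluetooth)
}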

It simply wouldn't be feasible to redevelop the entire application using an outdated WebRTC SDK (Google no longer maintains the WebRTC iOS SDK), and then build my own Swift SIP stack like SIPJS and end up with two sets of code to maintain... so my main questions are:

  1. How can I access the AVAudioSession of WKWebView so that I can set the output path/device?
  2. How can I have the microphone stay active when the app is sent to the background?
  3. How can I activate the microphone when CallKit activates the application (while the application is in the background)?
Conrad
  • I don't think there's anything you can do about point 1., except log bugs like this (sorry if this _is_ you): https://bugs.webkit.org/show_bug.cgi?id=167788 – Rhythmic Fistman Mar 16 '22 at 10:40
  • It's not my bug report, but what's insane is that it was reported back in 2017. – Conrad Mar 16 '22 at 11:09
  • You could run the UI in the webview and WebRTC in Swift, like some did before Apple enabled it in WKWebView, and just expose your RTCPeerConnection and getUserMedia APIs to the WKWebView JS DOM (e.g. in an old Objective-C project: https://github.com/common-tater/wkwebview-webrtc-shim/blob/master/WKWebViewWebRTCShim/WKWebViewWebRTCShim.m). That's quite a big job, but easier than having to redevelop the entire project in Swift. There are iOS WebRTC builds available, or you can easily build one. `while the application is in the background` - do you mean when you receive a VoIP push payload? – Pierre Noyelle Mar 16 '22 at 16:32
  • Thanks @PierreNoyelle, I'll take a look at that. Yes, `while the application is in the background` means any time my app isn't the frontmost app - either by being sent to the background (question 2), or (question 3) starting up in the background (from a VoIP push). To be clear, the microphone only needs to activate when the user taps Answer. If the user taps Answer on the CallKit lock screen, my app never goes to the foreground. Remember, all the VoIP push can do is wake the app and give it time to register on the network so it can receive the call as normal. – Conrad Mar 17 '22 at 09:04
  • @PierreNoyelle - this rather clever project appears to solve the wrong parts of the problem. While a WKWebView can now do all the getUserMedia and PeerConnection work, it's the MediaStream that's the problem - at a point (onTrack) I need to play out the audio with an HTML5 <audio> element. – Conrad Mar 17 '22 at 09:25
  • Hello @Conrad I've been searching for answers to your 1, 2 & 3 for a couple of days now. I simply don't want to give up and "go all-in native". Did you find any workarounds? – Richard Gustavsson May 23 '22 at 23:39

1 Answer


For 1): maybe someone else is also following this approach and can add some insight / correct wrong assumptions. The audio in a WebRTC site is represented as a MediaStream. Maybe it is possible to get that stream out of the WKWebView and play it back within the app somehow? This code should pass on some buffers, but they are empty when they arrive in Swift:

//javascript
...
// audioStream is the MediaStream from the WebRTC call (setup elided above)
someRecorder = new MediaRecorder(audioStream);
someRecorder.ondataavailable = async (e) =>
{
    window.webkit.messageHandlers.callBackMethod.postMessage(await e.data.arrayBuffer());
}
someRecorder.start(1000); // deliver a chunk every 1000ms

and then in Swift receive it like:

//swift
import UIKit
import WebKit

class ViewController: UIViewController, WKScriptMessageHandler {
    ...
    override func viewDidLoad() {
        super.viewDidLoad()
        let config = WKWebViewConfiguration()
        config.userContentController = WKUserContentController()
        config.userContentController.add(self, name: "callBackMethod")
        let webView = WKWebView(frame: CGRect(x: 0, y: 0, width: 10, height: 10), configuration: config)
        ...
    }

    func userContentController(_ userContentController: WKUserContentController, didReceive message: WKScriptMessage) {
        addToPlayingAudioBuffer(message.body)
        // print(message.body) gives the output "{}" every 1000ms.
    }
}
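One likely reason the buffers arrive empty: WKScriptMessage.body only carries plain property-list style values (strings, numbers, dates, arrays, dictionaries, null), so an ArrayBuffer gets serialized as an empty object. A possible (untested) workaround would be to base64-encode the chunk in JavaScript before posting it, and decode it on the Swift side, roughly like this:

// Untested sketch: assumes the page converts the ArrayBuffer to a base64 string
// before calling postMessage(...), instead of posting the raw ArrayBuffer.
func userContentController(_ userContentController: WKUserContentController, didReceive message: WKScriptMessage) {
    guard let base64 = message.body as? String,
          let chunk = Data(base64Encoded: base64) else { return }
    // `chunk` now holds one recorded audio fragment (in whatever container/codec
    // MediaRecorder produced), which would still need decoding before playback.
    addToPlayingAudioBuffer(chunk)   // hypothetical helper from the code above
}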
svveptuum
  • Would this not create an (up to) 1000ms delay in the audio? Seems like quite a high cost. – Conrad Nov 30 '22 at 06:06
  • Yes, it would. And when pursuing that workaround, one would be faced with the task of reassembling that buffer back into audio, which might add even more delay on top of that. But if there were a more direct way to "hijack" the stream into the app's AVAudioSession, it could be a feasible workaround - maybe someone comes up with one. – svveptuum Dec 01 '22 at 09:39