HTML5 & Web Audio API: Streaming microphone data from browser to server. Ideal transports and data compression

Question

I am looking to take the audio input from the browser and stream it to multiple listeners. The intended use is for music, so the quality must be MP3 standard or thereabouts.

I have attempted two ways, both yielding unsuccessful results:

WebRTC

Streaming audio directly between browsers works fine, but from what I have seen the audio quality is not customisable. (It uses the Opus audio codec, but does not seem to expose any controls.)
Does anyone have any insight into how to increase the audio quality in WebRTC streams?

Websockets

The issue is the transport from the browser to the server. The PCM audio data I am acquiring via the method below has proven too large to stream continuously to the server via WebSockets. The stream works perfectly on high-speed internet connections, but on slower wifi it is unusable.

var context = new webkitAudioContext()
navigator.webkitGetUserMedia({audio:true}, gotStream)

function gotStream (stream)
{
    var source = context.createMediaStreamSource(stream)
    var proc = context.createScriptProcessor(2048, 2, 2)

    source.connect(proc)
    proc.connect(context.destination)
    proc.onaudioprocess = function(event)
    {
        var audio_data = event.inputBuffer.getChannelData(0)|| new Float32Array(2048)
        console.log(audio_data)
        // send audio_data to server
    }
}

So the main question is, is there any way to compress the PCM data in order to make it easier to stream to the server? Or perhaps there is an easier way to go about this?
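To put the size problem in numbers, here is a rough back-of-the-envelope estimate of the raw PCM bandwidth. It is only a sketch: the 44100 Hz sample rate is an assumption (the real value is whatever `context.sampleRate` reports), and `pcmBitsPerSecond` is a hypothetical helper name, not part of any API.

```javascript
// Rough bandwidth of uncompressed PCM as delivered by the Web Audio API.
// getChannelData() returns Float32 samples, i.e. 4 bytes per sample.
// The 44100 Hz sample rate below is an assumption for illustration.
function pcmBitsPerSecond(sampleRate, channels, bytesPerSample) {
    return sampleRate * channels * bytesPerSample * 8;
}

var rawFloat32 = pcmBitsPerSecond(44100, 1, 4); // one Float32 channel
var rawInt16 = pcmBitsPerSecond(44100, 1, 2);   // same channel as 16-bit PCM

console.log(rawFloat32 / 1000 + ' kbit/s'); // 1411.2 kbit/s
console.log(rawInt16 / 1000 + ' kbit/s');   // 705.6 kbit/s
```

Even a single channel is therefore roughly ten times the size of a 128 kbit/s MP3 stream, which matches the behaviour described on slower connections.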

IyadAssaf: do you have sample code for streaming audio over a WebSocket? If so, could you please provide it? Thanks. – Pradeep, Apr 02 '14 at 12:22
You might know the answer to this question: https://stackoverflow.com/questions/56308420/how-to-convert-the-float32array-format-of-native-html5-recorded-audio-to-proper I really need some help with this. – George Pligoropoulos, May 25 '19 at 21:33

score 4 · Answer 1 · answered Dec 23 '13 at 20:25

There are lots of ways to compress PCM data, sure, but realistically, your best bet is to get WebRTC to work properly. WebRTC is designed to do this - adaptively stream media - although you don't define what you mean by "multiple" listeners (there's a huge difference between 3 listeners and 300,000 simultaneous listeners).

answered Dec 23 '13 at 20:25

cwilso


I am hoping to have a considerable number (possibly up to 300,00) of listeners, which is why I was leaning towards WebSockets, but if you think WebRTC is feasible, is there any way to control the audio quality? I recognise that the technology is very much meant for voice, but given the NetEQ and echo/noise canceller in the VoiceEngine class ([reference](http://www.webrtc.org/reference/architecture)), I suppose there isn't any way to change this? Perhaps there will be high-level access to such classes in a later version of the WebRTC draft. – IyadAssaf Dec 24 '13 at 12:00
Your problem is going to be scaling. Most systems aren't going to directly support 3000 simultaneous socket connections pumping an audio stream... That's just a lot of data. – cwilso Dec 24 '13 at 18:27

score 2 · Answer 2 · answered Dec 26 '13 at 11:36

There are several possible ways of resampling and/or compressing your data, though none of them are native. I resampled the data to 8 kHz mono (your mileage may vary) with the xaudio.js lib from the speex.js environment. You could also compress the stream using Speex, though that is usually used for speech only. In your case, I would probably send the stream to a server, compress it there and stream it to your audience. I really don't believe a simple browser is good enough to serve data to a huge audience.
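As a rough illustration of the resampling idea, here is a minimal sketch in plain JavaScript. It does not use xaudio.js or speex.js; it swaps in a naive block-averaging downsampler plus a Float32 to 16-bit PCM conversion. The function names `downsample` and `floatTo16BitPCM` are hypothetical, and averaging is only a crude low-pass filter, so a proper resampler will sound better.

```javascript
// Naive downsampler: each output sample is the average of the run of
// input samples that maps onto it. Assumes outputRate < inputRate;
// otherwise the input is returned unchanged.
function downsample(input, inputRate, outputRate) {
    if (outputRate >= inputRate) {
        return input;
    }
    var ratio = inputRate / outputRate;
    var outLength = Math.floor(input.length / ratio);
    var output = new Float32Array(outLength);
    for (var i = 0; i < outLength; i++) {
        var start = Math.floor(i * ratio);
        var end = Math.min(Math.floor((i + 1) * ratio), input.length);
        var sum = 0;
        for (var j = start; j < end; j++) {
            sum += input[j];
        }
        output[i] = sum / (end - start);
    }
    return output;
}

// Convert floats in [-1, 1] to 16-bit signed PCM, halving the byte size.
// Out-of-range samples are clamped first.
function floatTo16BitPCM(input) {
    var output = new Int16Array(input.length);
    for (var i = 0; i < input.length; i++) {
        var s = Math.max(-1, Math.min(1, input[i]));
        output[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return output;
}
```

Inside `onaudioprocess`, something like `floatTo16BitPCM(downsample(audio_data, context.sampleRate, 8000)).buffer` would then be the payload to send. The target rate is a parameter, so a higher value than 8000 can be chosen for music at the cost of more bandwidth.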

score 0 · Answer 3 · answered Nov 17 '19 at 07:04

WebRTC seems to default to one mono channel at around 42 kb/s, as it appears to be designed primarily for voice.

You can disable the audio processing features using constraints to get a more consistent input from the browser using:

navigator.mediaDevices.getUserMedia({
    audio: {
        autoGainControl: false,
        channelCount: 2,
        echoCancellation: false,
        latency: 0,
        noiseSuppression: false,
        sampleRate: 48000,
        sampleSize: 16,
        volume: 1.0
    }
});

Then you also should set stereo and maxaveragebitrate params on the SDP:

let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);

The resulting SDP should then contain a line which looks like this:

a=fmtp:111 minptime=10;useinbandfec=1; stereo=1; maxaveragebitrate=510000
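The plain string replace only works when the SDP happens to contain `useinbandfec=1`. A slightly more defensive variant, sketched here under assumptions, looks up the Opus payload type from the `a=rtpmap` line and merges the parameters into the matching `a=fmtp` line. `setOpusParams` is a hypothetical helper name, and the sketch leaves the SDP untouched if either line is missing.

```javascript
// Add or override parameters on the Opus fmtp line of an SDP string.
// Returns the SDP unchanged if it has no Opus rtpmap or no matching fmtp line.
function setOpusParams(sdp, params) {
    // Find the payload type mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
    var rtpmap = sdp.match(/a=rtpmap:(\d+) opus\/48000/i);
    if (!rtpmap) {
        return sdp;
    }
    var payloadType = rtpmap[1];
    var fmtpPattern = new RegExp('a=fmtp:' + payloadType + ' ([^\\r\\n]*)');
    return sdp.replace(fmtpPattern, function (line, existing) {
        // Parse the existing "key=value;key=value" list into an object.
        var merged = {};
        existing.split(';').forEach(function (pair) {
            var parts = pair.trim().split('=');
            if (parts[0]) {
                merged[parts[0]] = parts[1];
            }
        });
        // Requested values win over whatever was already there.
        Object.keys(params).forEach(function (key) {
            merged[key] = params[key];
        });
        var joined = Object.keys(merged).map(function (key) {
            return key + '=' + merged[key];
        }).join(';');
        return 'a=fmtp:' + payloadType + ' ' + joined;
    });
}
```

It would be used in place of the replace call, for example `answer.sdp = setOpusParams(answer.sdp, { stereo: 1, maxaveragebitrate: 510000 });` before `setLocalDescription`.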

This could increase the bitrate up to 510 kb/s for stereo, which is 255 kb/s per channel. The actual bitrate depends on the speed of your network and the strength of your signal, though.
