I have access to IBM Watson's Speech-To-Text API, which allows streaming via WebSockets, and I'm able to call getUserMedia()
to capture a microphone MediaStream in the browser, but now I need to work out the best way to stream that audio to the server in real time.
I intend to chain two WebSocket connections, browser <=> my server <=> Watson,
using my server as a relay for CORS reasons.
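To make the relay idea concrete, here's a minimal sketch of what I have in mind on the server side, assuming Node.js with the ws package; the Watson endpoint URL and authentication below are placeholders, since the relay itself just forwards frames in both directions:

var WebSocket = require('ws');

var wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function (browserSocket) {
  // One upstream Watson connection per browser client (URL/auth are placeholders)
  var watsonSocket = new WebSocket('wss://example.com/watson-speech-to-text');

  // Forward binary audio frames from the browser up to Watson
  browserSocket.on('message', function (data) {
    if (watsonSocket.readyState === WebSocket.OPEN) {
      watsonSocket.send(data);
    }
  });

  // Relay transcription results from Watson back down to the browser
  watsonSocket.on('message', function (data) {
    if (browserSocket.readyState === WebSocket.OPEN) {
      browserSocket.send(data);
    }
  });

  // Tear down the pair together
  browserSocket.on('close', function () { watsonSocket.close(); });
  watsonSocket.on('close', function () { browserSocket.close(); });
});

(A real version would presumably need to buffer audio that arrives before the Watson socket opens, but the shape is the same.)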
I have been looking at WebRTC and various experiments, but all of these seem to be inter-browser peer-to-peer rather than client-to-server as I intend.
The only other examples I've come across (e.g. RecordRTC) seem to be based around recording a WAV or FLAC file from the MediaStream
returned by getUserMedia()
and then sending the whole file to the server, but that approach has two problems (I sketch a possible workaround after this list):
- The user shouldn't have to press a start or a stop button; the app should be able to listen at all times.
- Even if I make a recording and stop it when there's a period of silence, there will be an unreasonable delay between the user speaking and getting a response from the server.
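One way I can imagine working around both problems is MediaRecorder's timeslice parameter, which makes the browser emit small compressed chunks continuously instead of one file at the end. This is only a rough sketch: browser support for MediaRecorder varies, and wss://example.com/relay stands in for my relay server.

navigator.mediaDevices.getUserMedia({ audio: true })
  .then(function (mediaStream) {
    var ws = new WebSocket('wss://example.com/relay'); // placeholder relay endpoint
    var recorder = new MediaRecorder(mediaStream);

    // Fire dataavailable every 250 ms instead of once at stop()
    recorder.ondataavailable = function (event) {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(event.data); // Blob containing a chunk of compressed audio
      }
    };

    recorder.start(250); // runs continuously; no user-facing start/stop button
  });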
I'm making a proof of concept and, if possible, I'd like this to work on as many modern browsers as it can - but most importantly, on mobile browsers. iOS seems to be out of the question on this one, though.
http://caniuse.com/#feat=stream
http://caniuse.com/#search=webrtc
Let's assume I just have this code for now:
// Shimmed with https://raw.githubusercontent.com/webrtc/adapter/master/adapter.js
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(function (mediaStream) {
    // Continuously send raw or compressed microphone data to the server
    // Continuously receive speech-to-text results back
  }, function (err) {
    console.error(err);
  });
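For the raw option, the shape I have in mind is tapping the MediaStream with the Web Audio API's ScriptProcessorNode, converting the Float32 samples to 16-bit PCM, and sending each buffer over the WebSocket as it arrives. Again only a sketch: the relay URL is a placeholder, and Watson would presumably also need to be told the sample rate (audioContext.sampleRate) and encoding.

navigator.mediaDevices.getUserMedia({ audio: true })
  .then(function (mediaStream) {
    var ws = new WebSocket('wss://example.com/relay'); // placeholder relay endpoint
    ws.binaryType = 'arraybuffer';

    var audioContext = new (window.AudioContext || window.webkitAudioContext)();
    var source = audioContext.createMediaStreamSource(mediaStream);
    var processor = audioContext.createScriptProcessor(4096, 1, 1);

    processor.onaudioprocess = function (event) {
      var float32 = event.inputBuffer.getChannelData(0);
      var int16 = new Int16Array(float32.length);
      for (var i = 0; i < float32.length; i++) {
        // Clamp and scale [-1, 1] floats to signed 16-bit integers
        var s = Math.max(-1, Math.min(1, float32[i]));
        int16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
      }
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(int16.buffer);
      }
    };

    source.connect(processor);
    // The node outputs silence here, but some browsers only fire
    // onaudioprocess while it is connected to a destination
    processor.connect(audioContext.destination);
  }, function (err) {
    console.error(err);
  });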