I have access to IBM Watson's Speech-To-Text API which allows streaming via WebSockets, and I'm able to call getUserMedia() to instantiate a microphone device in the browser, but now I need to work out the best way to stream this information in real-time.

I intend to set up a three-way WebSocket connection (browser <=> my server <=> Watson), using my server as a relay for CORS reasons.
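
To make the relay idea concrete, here is a minimal sketch of the forwarding step on the server. It assumes both sockets are already open and expose the interface of the Node `ws` library (`on('message', (data, isBinary))`, `on('close')`, `send(data, options)`, `close()`); the `relay` function and its parameter names are hypothetical, and authentication with Watson is left out entirely.

```javascript
// Hypothetical helper: forward every message between two open sockets.
// Assumes ws-style sockets: on('message', (data, isBinary)), on('close'),
// send(data, { binary }), and close().
function relay(browserSocket, watsonSocket) {
    function pipe(from, to) {
        // Pass each frame through unchanged, keeping its text/binary type
        from.on('message', function (data, isBinary) {
            to.send(data, { binary: isBinary });
        });
        // When one side goes away, close the other side too
        from.on('close', function () {
            to.close();
        });
    }
    pipe(browserSocket, watsonSocket); // audio: browser -> Watson
    pipe(watsonSocket, browserSocket); // transcripts: Watson -> browser
}
```

With this shape the server never inspects the audio; it only moves frames between the two connections.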

I have been looking at WebRTC and various experiments, but all of these seem to be inter-browser peer-to-peer and not client-to-server like I intend.

The only other examples (e.g. RecordRTC) I've come across are seemingly based around recording a WAV or a FLAC file from the MediaStream returned by getUserMedia() and then sending the file to the server, but this itself has two problems:

  1. The user shouldn't have to press a start or a stop button - it should just be able to listen to the user at all times.
  2. Even if I make a recording and stop it when there's a period of silence, there will be an unreasonable time delay between speaking and getting a response from the server.

I'm making a proof of concept and if possible, I'd like this to work on as many modern browsers as it can - but most importantly, mobile browsers. iOS seems to be out of the question on this one though.

http://caniuse.com/#feat=stream

http://caniuse.com/#search=webrtc

Let's assume I just have this code for now:

// Shimmed with https://raw.githubusercontent.com/webrtc/adapter/master/adapter.js
navigator.mediaDevices.getUserMedia({ audio: true })
.then(function (mediaStream) {
    // Continuously send raw or compressed microphone data to server
    // Continuously receive speech-to-text services
}, function (err) {
    console.error(err);
});
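
For illustration, the "continuously send" step might be filled in along these lines. This is only a sketch under assumptions: it uses the Web Audio API's `ScriptProcessorNode` (since deprecated, but widely available), sends raw 16-bit PCM, and the names `floatTo16BitPCM` and `streamMicrophone` as well as the `socket` argument (an open WebSocket to my server) are hypothetical.

```javascript
// Convert Web Audio Float32 samples (range -1..1) into 16-bit signed
// little-endian PCM, clamping any out-of-range values first.
function floatTo16BitPCM(float32Samples) {
    var buffer = new ArrayBuffer(float32Samples.length * 2);
    var view = new DataView(buffer);
    for (var i = 0; i < float32Samples.length; i++) {
        var s = Math.max(-1, Math.min(1, float32Samples[i]));
        view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
    }
    return buffer;
}

// Hypothetical wiring: tap the microphone stream and send each block of
// samples over an already-open WebSocket. Browser-only.
function streamMicrophone(mediaStream, socket) {
    var AudioContextClass = window.AudioContext || window.webkitAudioContext;
    var audioContext = new AudioContextClass();
    var source = audioContext.createMediaStreamSource(mediaStream);
    // 4096-sample blocks, one input channel, one output channel
    var processor = audioContext.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = function (event) {
        if (socket.readyState === WebSocket.OPEN) {
            socket.send(floatTo16BitPCM(event.inputBuffer.getChannelData(0)));
        }
    };
    source.connect(processor);
    // Some browsers only fire onaudioprocess when the node is connected onward
    processor.connect(audioContext.destination);
    // The receiver needs the sample rate to interpret the raw PCM
    return audioContext.sampleRate;
}
```

Calling `streamMicrophone(mediaStream, socket)` inside the `.then` callback would start sending audio without any start or stop button, though whether this works on the mobile browsers I care about is exactly what I'm unsure of.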
Chris Watts
  • See http://stackoverflow.com/questions/34972529/can-i-use-some-sort-of-local-storage-as-a-temporary-holding-place-for-getusermed/34997248#34997248 – jib Feb 02 '16 at 01:52
  • Have you looked into any MCU? – mido Feb 02 '16 at 02:14
  • @mido No - what do you mean by MCU in this context? – Chris Watts Feb 02 '16 at 19:53
  • @jib according to https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder#Browser_compatibility only Firefox OS supports MediaRecorder on mobile – Chris Watts Feb 02 '16 at 22:21
  • @CJxD I understand you want to send a media stream from the browser to a server; an efficient way to do that is to use an MCU like Kurento or Licode. – mido Feb 03 '16 at 00:10
  • Right, you wanted mobile (and `MediaRecorder` only does audio on Firefox for Android at the moment it seems). Better go with an MCU then, as mido suggests. – jib Feb 03 '16 at 03:00
  • I don't think I can use an MCU as these all seem to require software installation. I'm using a Platform as a Service for the web host so I don't have the ability to install stuff unless I buy a separate server. For a proof of concept this seems overkill. – Chris Watts Feb 03 '16 at 19:24
  • RecordRTC seems to be able to record audio cross-platform, but I just can't seem to get around the limitation of having to stop recording before a file is available. Perhaps the only way forward is to fork that project? – Chris Watts Feb 03 '16 at 19:25

0 Answers