
I am trying to analyse the audio output from the browser, but I don't want the getUserMedia prompt (which asks for microphone permission) to appear. The sound sources are SpeechSynthesis and an MP3 file. Here's my code:

return navigator.mediaDevices.getUserMedia({
        audio: true
      })
      .then(stream => {
        // add the mic track to our stream and wire it into the analyser
        const track = stream.getAudioTracks()[0];
        this.mediaStream_.addTrack(track);
        this._source = this.audioContext.createMediaStreamSource(this.mediaStream_);
        this._source.connect(this.analyser);
        this.draw(this);
      });

This code works fine, but it asks for permission to use the microphone! I am not interested in the microphone at all; I only need to gauge the audio output. If I check all available devices:

navigator.mediaDevices.enumerateDevices()
.then(function(devices) {
  devices.forEach(function(device) {
    console.log(device.kind + ": " + device.label +
            " id = " + device.deviceId);
  });
})

I get a list of the devices available in the browser, including 'audiooutput'. So, is there a way to route the audio output into a MediaStream that can then be used inside the createMediaStreamSource function? I have checked all the Web Audio API documentation but could not find it. Thanks to anyone who can help!

GiulioG
  • Where does this sound come from? From what the different APIs give us, once it has reached the output it's already too late to catch it; you need to do it before, and this can sometimes be done, but in really different ways depending on the **source**. – Kaiido Feb 19 '18 at 08:55
  • The sound is coming from 2 sources: SpeechSynthesis and Mp3 file. – GiulioG Feb 19 '18 at 08:57
  • Would be good to include it as an [edit] – Kaiido Feb 19 '18 at 08:59

1 Answer

There are various ways to get a MediaStream that doesn't originate from gUM, but you won't be able to catch all possible audio output...

But for your mp3 file, if you play it through a MediaElement (<audio> or <video>), and if this file is served without breaking CORS, then you can use MediaElement.captureStream(). If you play it through the Web Audio API, or if you target browsers that don't support captureStream(), then you can use AudioContext.createMediaStreamDestination().
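For instance, here is a minimal sketch of both routes; the file name music.mp3, the element wiring and the Firefox fallback are my assumptions, not code from the question:

    // Route 1: HTMLMediaElement.captureStream()
    // (prefixed as mozCaptureStream in Firefox).
    // 'music.mp3' is a placeholder; the file must be CORS-accessible.
    const audioEl = new Audio('music.mp3');
    audioEl.crossOrigin = 'anonymous';
    const ctx = new AudioContext();
    const analyser = ctx.createAnalyser();
    const stream = audioEl.captureStream ?
          audioEl.captureStream() :
          audioEl.mozCaptureStream();
    ctx.createMediaStreamSource(stream).connect(analyser);
    audioEl.play();

    // Route 2: decode the file yourself and use createMediaStreamDestination();
    // dest.stream is a plain MediaStream you can analyse, record or send.
    const dest = ctx.createMediaStreamDestination();
    fetch('music.mp3')
      .then(response => response.arrayBuffer())
      .then(buffer => ctx.decodeAudioData(buffer))
      .then(audioBuffer => {
        const src = ctx.createBufferSource();
        src.buffer = audioBuffer;
        src.connect(dest);            // into the MediaStream
        src.connect(ctx.destination); // and out to the speakers
        src.start();
      });

Note that if the audio already lives in the Web Audio graph, you can also skip the MediaStream entirely and connect the source straight to an AnalyserNode; the stream is only needed when some other API requires one.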

For SpeechSynthesis, unfortunately you will need gUM... and a Virtual Audio Device: first you would have to set your default output to VAB_out, then route VAB_out to VAB_in, and finally grab VAB_in with gUM...
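A rough sketch of the browser side of that setup; the device label "VAB_in" is purely hypothetical, and note that device labels are only exposed once some permission has been granted:

    // Hypothetical: VAB_out is the system default output and the virtual
    // audio device driver loops it back to the VAB_in input.
    navigator.mediaDevices.enumerateDevices()
      .then(devices => {
        const vabIn = devices.find(d =>
            d.kind === 'audioinput' && /VAB_in/i.test(d.label));
        return navigator.mediaDevices.getUserMedia({
          audio: { deviceId: { exact: vabIn.deviceId } }
        });
      })
      .then(stream => {
        // feed `stream` to createMediaStreamSource, as in the question's code
      });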

Not an easy nor universally doable task, especially since, IIRC, SpeechSynthesis doesn't have any setSinkId method.

Kaiido
  • Thank you for your answer. I was really hoping there would be a way around gUM for analysing sound output for speech synthesis. It doesn't make sense to ask for microphone permission when all you need is sound output! – GiulioG Feb 19 '18 at 09:14
  • Don't tell me about it... But that's the current status of things, and it's even a pain for implementors, who can't do automated tests without a VAD... SS goes directly to the OS outputs; you've got no direct access to it. – Kaiido Feb 19 '18 at 09:17
  • Do you know if there's any plan to change this state of things in the future? I am embarking on a long-term project that involves web audio, and I don't want to write code that will become obsolete too quickly... – GiulioG Feb 19 '18 at 09:21