
I would like to do speech analysis in the browser. I have a microphone input as my main stream, created when I start the speech recognition object, and I would like to get frequencies from that same stream. How do I connect an audio context source to the same microphone stream that the voice recognition uses? Do I have to request microphone permission twice? I tried the code below, but getMicData() only logs '0' values (a sketch of one possible fix follows the code).

JS

var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var requestAnimationFrame = window.requestAnimationFrame || window.mozRequestAnimationFrame ||
                            window.webkitRequestAnimationFrame || window.msRequestAnimationFrame;

var cancelAnimationFrame = window.cancelAnimationFrame || window.mozCancelAnimationFrame;
let audioCtx, analyser;
let amplitude;
let bufferLength;
let dataArray;
let bassArray;
let trebleArray;

let recognition = new SpeechRecognition();
recognition.continuous = false;
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
let animationRequest;
const recordbtn = document.getElementById('record');

recordbtn.addEventListener('click', () => {
   // start speech rec
   recognition.start();
   audioCtx = new (window.AudioContext || window.webkitAudioContext)();
   analyser = audioCtx.createAnalyser();
   analyser.fftSize = 512;
   analyser.smoothingTimeConstant = 0.85;
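   // NOTE: nothing is ever connected to this analyser, so
   // getByteFrequencyData() reads silence (all zeros)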
   
})

recognition.onstart = function () {
   document.getElementById('font-name').innerHTML = "START SPEAKING";
   getMicData();
}

recognition.onspeechend = function () {
   cancelAnimationFrame(animationRequest);
}
function getMicData() {
   animationRequest = window.requestAnimationFrame(getMicData);
   bufferLength = analyser.frequencyBinCount; // fftSize / 2 usable frequency bins
   dataArray = new Uint8Array(bufferLength);
   analyser.getByteFrequencyData(dataArray);

   let maxAmp = 0;
   let sumOfAmplitudes = 0;

   for (let i = 0; i < bufferLength; i++) {
      let thisAmp = dataArray[i]; // amplitude of current bin
      if (thisAmp > maxAmp) {
         maxAmp = thisAmp; // track the loudest bin
      }
      sumOfAmplitudes += thisAmp; // sum every bin, not just the loud ones
   }
   let averageAmplitude = sumOfAmplitudes / bufferLength;
   console.log(averageAmplitude);
   return averageAmplitude;
}
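
For reference, here is a minimal sketch of one way to wire this up, assuming the variables declared above. SpeechRecognition captures audio internally and does not expose its MediaStream, so the analyser needs its own stream from getUserMedia; in practice the browser reuses the existing microphone grant rather than prompting a second time.

recordbtn.addEventListener('click', async () => {
   recognition.start();
   audioCtx = new (window.AudioContext || window.webkitAudioContext)();
   analyser = audioCtx.createAnalyser();
   analyser.fftSize = 512;
   analyser.smoothingTimeConstant = 0.85;

   // Ask for the mic and feed it into the analyser. The source does not
   // need to be connected to audioCtx.destination unless you want playback.
   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
   const source = audioCtx.createMediaStreamSource(stream);
   source.connect(analyser);
});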
Laiqa Mohid
  • You may review Web Audio controllers. IMO the intent is to be able to instantiate a source buffer (the mic) and to pipe it successively through each controller that you want to hook up to the buffer source: controller 1 for speech recognition, controller 2 for the meter. Not an expert, but there may be samples of this type of pattern - https://www.npmjs.com/package/microphone-stream – Robert Rowntree Dec 12 '20 at 19:53
  • OK, thanks, I will check it out – Laiqa Mohid Dec 12 '20 at 20:42
  • https://github.com/AnthumChris/fetch-stream-audio read the "background" here also. I don't mean for you to go server-side if the recognizer and the meter are both OK client-side. FWIW, I'd be inclined to do recognition with a different API and to do it server-side. Node streaming ideas aren't relevant if you keep everything client-side – Robert Rowntree Dec 12 '20 at 21:38
  • I actually have two prototypes: one uses Google Speech and one uses the Web Audio API. With the Google Speech one I'm almost there; I have a recording of the transcribed audio saved to the same file system every time, but I don't know how to extract high or low frequencies from the array of amplitudes (a sketch of one approach follows these comments) – Laiqa Mohid Dec 13 '20 at 12:33
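
On the last comment's question about extracting high and low frequencies: a minimal sketch, assuming a standard AnalyserNode and treating everything below 250 Hz as bass and above 4 kHz as treble (both cutoffs are arbitrary placeholder values). Each element of the frequency-data array is one bin of width sampleRate / fftSize Hz, so a frequency range maps to a slice of the array:

// Sketch: split getByteFrequencyData() output into bass/treble averages.
// Assumes `analyser` is an AnalyserNode and `audioCtx` its AudioContext;
// the 250 Hz and 4000 Hz cutoffs are placeholder values.
function getBandAverages(analyser, audioCtx) {
   const data = new Uint8Array(analyser.frequencyBinCount); // fftSize / 2 bins
   analyser.getByteFrequencyData(data);

   const binWidth = audioCtx.sampleRate / analyser.fftSize; // Hz per bin
   const bassEnd = Math.floor(250 / binWidth);      // bins below 250 Hz
   const trebleStart = Math.floor(4000 / binWidth); // bins above 4 kHz

   const avg = (arr) => arr.length ? arr.reduce((a, b) => a + b, 0) / arr.length : 0;
   return { bass: avg(data.slice(0, bassEnd)), treble: avg(data.slice(trebleStart)) };
}

For example, with a 48 kHz context and fftSize 512, each bin is 93.75 Hz wide, so the bass slice covers roughly the first two bins.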

0 Answers