I'm trying to create an application that transcribes streaming audio. The idea is to capture the user's microphone stream using RecordRTC and send it in chunks to a Gunicorn server using Socket.IO. The server will then create an input stream for Azure Speech to Text.
I'm trying to capture audio every x seconds with RecordRTC in a format that is accepted by Azure Speech to Text:
    startRecording.onclick = function() {
        startRecording.disabled = true;
        navigator.getUserMedia({
            audio: true
        },
        function(stream) {
            recordAudio = RecordRTC(stream, {
                type: 'audio',
                mimeType: 'audio/wav',
                desiredSampRate: 16000, // accepted sample rate by Azure
                timeSlice: 1000,
                ondataavailable: (blob) => {
                    socketio.emit('stream_audio', blob); // sends blob to server
                    console.log("sent blob");
                },
                recorderType: StereoAudioRecorder,
                numberOfAudioChannels: 1
            });
            recordAudio.startRecording();
            stopRecording.disabled = false;
        },
        function(error) {
            console.error(JSON.stringify(error));
        });
    };
The blob passed to ondataavailable seems to contain a byte string. However, for Azure Speech to Text I would prefer to receive chunks in WAV format. It is possible to retrieve the entire recording in WAV format using getBlob(), but the client only generates that file after stopRecording() is called.
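To make concrete what "a chunk in WAV format" would mean here, below is a minimal sketch of wrapping one chunk in a standard 44-byte RIFF/WAVE header. It assumes each chunk is raw 16-bit little-endian mono PCM at 16 kHz, which I have not confirmed is what RecordRTC actually delivers per time slice. `wrapPcmInWav` is a hypothetical helper, not part of RecordRTC or the Azure SDK.

```javascript
// Hypothetical helper (not a RecordRTC or Azure API).
// Assumption: pcmBytes is a Uint8Array of raw 16-bit little-endian PCM samples.
function wrapPcmInWav(pcmBytes, sampleRate = 16000, numChannels = 1) {
    const bitsPerSample = 16;
    const blockAlign = numChannels * bitsPerSample / 8; // bytes per sample frame
    const byteRate = sampleRate * blockAlign;           // bytes per second

    const buffer = new ArrayBuffer(44 + pcmBytes.length);
    const view = new DataView(buffer);
    const writeAscii = (offset, text) => {
        for (let i = 0; i < text.length; i++) {
            view.setUint8(offset + i, text.charCodeAt(i));
        }
    };

    writeAscii(0, 'RIFF');
    view.setUint32(4, 36 + pcmBytes.length, true); // file size minus the first 8 bytes
    writeAscii(8, 'WAVE');
    writeAscii(12, 'fmt ');
    view.setUint32(16, 16, true);                  // size of the fmt sub-chunk
    view.setUint16(20, 1, true);                   // audio format 1 = uncompressed PCM
    view.setUint16(22, numChannels, true);
    view.setUint32(24, sampleRate, true);
    view.setUint32(28, byteRate, true);
    view.setUint16(32, blockAlign, true);
    view.setUint16(34, bitsPerSample, true);
    writeAscii(36, 'data');
    view.setUint32(40, pcmBytes.length, true);     // size of the sample data

    new Uint8Array(buffer, 44).set(pcmBytes);      // copy samples after the header
    return new Uint8Array(buffer);
}
```

If the chunks turn out to carry a WAV header already, this step would be unnecessary, and the header would instead need to be skipped on the server before the samples are written to a raw PCM input stream.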
Is there a way for RecordRTC to return a blob in wave format every x seconds? If not, what are other options to stream audio to Azure Speech to Text through Gunicorn?
All help is much appreciated!