I am trying to make a user interface where a user would click a mic button and speak to it and then a live stream transcription would appear on the screen. I also need to generate an audio and play it to the user as a reply to the user's input audio. There are already packages who do that(just need to npm install them), but I was wondering if I should use them or use AWS-SDK POLLY and client-transcribe-streaming packages offered by amazon web services. The already existing packages seem very easy to use but am not sure if that is reliable. Where as if I use AWS-SDK,it seems like setting up the mic and speaker of the browser and setting up the environment seems very complicated as I was trying it. The following code shows AWS-SDK method of transcribing live speaking streams but I couldn't see the transcribed texts. Is there something wrong with the code? Any suggestion would help
import { SECRET_ACCESS_KEY, ACCESS_KEY_ID } from "./transcribeGlobal.js";
import React, { useState } from "react";
import { TranscribeStreamingClient } from "@aws-sdk/client-transcribe-streaming";
import { MicrophoneStream } from "microphone-stream";
const accessKeyId = ACCESS_KEY_ID;
const secretAccessKey = SECRET_ACCESS_KEY;
const region = "us-east-1";
const languageCode = "en-US";
const transcribeClient = new TranscribeStreamingClient({
region,
credentials: { accessKeyId, secretAccessKey },
});
const request = {
AudioStream: {},
LanguageCode: languageCode,
MediaEncoding: "pcm",
SampleRateHertz: 44100,
};
function TranscribeClientSpeech() {
const [transcription, setTranscription] = useState("");
const handleStream = (stream) => {
const micStream = new MicrophoneStream(stream);
micStream.on("data", (chunk) => {
const audioChunk = new TextEncoder().encode(chunk);
request.AudioStream = { AudioEvent: { AudioChunk: audioChunk } };
transcribeClient.send(request);
});
transcribeClient.on("data", (response) => {
const transcript = response.TranscriptEvent.Transcript.Results.reduce(
(acc, result) => acc + result.Alternatives[0].Transcript,
""
);
setTranscription(transcript);
});
transcribeClient.on("error", (err) => {
console.error("Error with transcription stream", err);
});
};
console.log(transcription)
return (
<div>
<h1>Live Transcription</h1>
<p>{transcription}</p>
<button onClick={() => navigator.mediaDevices.getUserMedia({ audio: true }).then(handleStream)}>
Start Transcription
</button>
<p>{transcription}</p>
</div>
);
}
export default TranscribeClientSpeech;
Thanks for the suggestions in advance
I am expecting for the transcribed texts to show up, but it's still empty