As described in the Microsoft documentation linked below, audio can be streamed into the recognizer using the Speech SDK.
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-audio-input-streams
First, identify the format of your audio input stream and check that it is one Azure Cognitive Services supports. Then make sure your code meets that requirement by supplying the raw audio in that format. Next, derive your own audio input stream class from PullAudioInputStreamCallback. Create an audio configuration based on your audio format and input stream. When you construct the recognizer, pass it both the audio configuration and your regular speech configuration.
Example code:
var audioConfig = AudioConfig.FromStreamInput(new ContosoAudioStream(config), audioFormat);
var speechConfig = SpeechConfig.FromSubscription(...);
var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
// Run stream through recognizer.
var result = await recognizer.RecognizeOnceAsync();
// In the C# SDK the recognized text is exposed through the Text property.
var text = result.Text;
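
The snippet above does not show the custom stream class or how audioFormat is built. Below is a minimal sketch of both, assuming the source is raw 16 kHz, 16-bit, mono PCM read from a .NET Stream. The class name RawPcmStreamCallback, the helper StreamAudioConfigFactory and the file path parameter are hypothetical and only for illustration; adjust the format values to match your actual audio.

```csharp
using System;
using System.IO;
using Microsoft.CognitiveServices.Speech.Audio;

// Hypothetical stand-in for ContosoAudioStream: feeds raw PCM bytes
// from any readable Stream to the Speech SDK on demand.
public sealed class RawPcmStreamCallback : PullAudioInputStreamCallback
{
    private readonly Stream _source;

    public RawPcmStreamCallback(Stream source)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
    }

    // The SDK calls Read whenever it needs more audio.
    // Returning 0 tells the SDK that the stream has ended.
    public override int Read(byte[] dataBuffer, uint size)
    {
        return _source.Read(dataBuffer, 0, (int)size);
    }

    // The SDK calls Close when it is finished with the stream.
    public override void Close()
    {
        _source.Dispose();
    }
}

public static class StreamAudioConfigFactory
{
    // Assumption: the input is headerless PCM at 16 kHz, 16 bits per sample, one channel.
    public static AudioConfig FromRawPcmFile(string path)
    {
        var audioFormat = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1);
        var callback = new RawPcmStreamCallback(File.OpenRead(path));
        return AudioConfig.FromStreamInput(callback, audioFormat);
    }
}
```

The AudioConfig returned by this helper can be passed to the SpeechRecognizer constructor in place of the audioConfig shown earlier. A pull callback suits cases where the SDK should ask for data at its own pace; if your code receives audio in pushes (for example from a socket), the SDK's push stream may be a better fit.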