I have a use case where I want to upload audio files (.wav) to blob storage, have the upload trigger a Function, and get the text from the audio. At the moment, the only approach that works is having the audio file available locally, because the audio config won't accept the URI of the audio file. This is the code I'm using:
import azure.cognitiveservices.speech as speechsdk
speech_key, service_region = "sub-key", "westeurope"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_input = speechsdk.AudioConfig(filename="**BLOB URI**")
speech_recognizer = speechsdk.SpeechRecognizer(speech_config, audio_input)
result = speech_recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))
From my research, a URI can't be passed as the filename (the bold part of the code). Solutions that download the file locally first won't work for me.
I tried reading the audio as a stream, but I couldn't find a way to convert it into an AudioInputStream.
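To make the stream idea concrete, here is a minimal sketch of the direction I have in mind. It is untested against the service and rests on several assumptions: the blob's contents are already in memory as bytes (for example, read from the Function's blob input), the file is plain PCM WAV (the only kind the standard `wave` module parses), and the SDK's `PushAudioInputStream` is the right stream type for this. The helper names `wav_bytes_to_pcm` and `audio_config_from_wav_bytes` are hypothetical, not part of the SDK.

```python
import io
import wave


def wav_bytes_to_pcm(wav_bytes):
    """Parse an in-memory PCM WAV file.

    Returns (pcm_frames, sample_rate, bits_per_sample, channels).
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        pcm = wav.readframes(wav.getnframes())
        return pcm, wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def audio_config_from_wav_bytes(wav_bytes):
    """Hypothetical helper: wrap in-memory WAV bytes in an AudioConfig.

    Assumes the raw PCM frames can be pushed into a PushAudioInputStream
    whose format matches the WAV header.
    """
    # Imported lazily so the WAV parsing above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    pcm, rate, bits, channels = wav_bytes_to_pcm(wav_bytes)
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=rate, bits_per_sample=bits, channels=channels
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(pcm)   # hand the whole file to the recognizer's input stream
    push_stream.close()      # signal end of audio
    return speechsdk.audio.AudioConfig(stream=push_stream)
```

If this works, the result of `audio_config_from_wav_bytes(blob_bytes)` would replace `audio_input` in the code above, with `blob_bytes` being whatever bytes the Function receives for the uploaded blob.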
Any help would be great. Thanks.