How to pass audio buffer to speech to text service using python

Question

I am using the Azure Speech to Text service from Python to process a batch of audio files. To process each file, I perform these steps:

1. Download the audio from a web server to the local 'C:/audio' folder.
2. Pass the path of the downloaded audio to the Speech SDK's AudioConfig(filename='C:/audio/my_audio.wav').

Rather than downloading to the local machine, I want to get the file from the server and pass it directly to the Speech to Text service. To do that:

1. I stored the audio as bytes in an audio buffer, like this: raw_audio = my_audio_in_bytes  # <class 'bytes'>
2. Then I passed the audio buffer to AudioConfig(filename=raw_audio). It doesn't work, because it expects a file path.

Is there a way to pass an audio buffer to this service?

Configuration Python code:

import azure.cognitiveservices.speech as speechsdk

# speech_key and service_region are defined elsewhere (subscription key and region name)
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_config = speechsdk.audio.AudioConfig(filename='C:/audios/audio1.wav')
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

Check the documentation. The AudioConfig function accepts a `stream` parameter, although I cannot tell you what format it wants. — Tim Roberts, Mar 03 '21 at 19:56
I tested with a stream as well. The push/pull streams accept the raw data, but the final transcripts are messy, with a lot of redundant words. Hence, I tried this approach. — user1990, Mar 03 '21 at 20:01
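
For reference, the `stream` route mentioned in the comments might look roughly like the sketch below. It assumes the bytes are a complete PCM WAV file, so the WAV header is parsed with the standard library and only the PCM frames are pushed, with the stream format set to match the file. Whether this helps with the messy transcripts described above is not something the thread confirms. The helper names `wav_bytes_to_pcm` and `recognize_from_bytes` are hypothetical.

```python
import io
import wave


def wav_bytes_to_pcm(raw_audio: bytes):
    """Return (pcm_frames, sample_rate, bits_per_sample, channels) from in-memory WAV bytes."""
    with wave.open(io.BytesIO(raw_audio), "rb") as wav:
        sample_rate = wav.getframerate()
        bits_per_sample = wav.getsampwidth() * 8
        channels = wav.getnchannels()
        pcm_frames = wav.readframes(wav.getnframes())
    return pcm_frames, sample_rate, bits_per_sample, channels


def recognize_from_bytes(raw_audio: bytes, speech_key: str, service_region: str):
    """Sketch: push in-memory audio to the recognizer instead of passing a file path."""
    # Imported lazily so the helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    pcm, rate, bits, channels = wav_bytes_to_pcm(raw_audio)
    # Describe the audio explicitly rather than relying on the stream's default format.
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=rate, bits_per_sample=bits, channels=channels
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(pcm)
    push_stream.close()  # signals that no more audio will arrive

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(stream=push_stream)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    return recognizer.recognize_once()
```

Note that `recognize_once()` returns after a single utterance, so longer recordings would need continuous recognition instead.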

score 0 · Accepted Answer · answered Mar 04 '21 at 02:18

@user1990, per our discussion on this GitHub issue, please use batch transcription, as the Speech SDK does not directly support recognizing from a WAV file hosted on a web service (you would first need to download it locally).
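
As a rough illustration of the batch transcription route, the sketch below only builds the HTTP request that would create a transcription job from audio URLs; it does not send it. It assumes the v3.0 REST endpoint path that was current when this answer was written, so check the current documentation for the version to use. The function name `build_batch_transcription_request` and the display name are hypothetical.

```python
import json
import urllib.request


def build_batch_transcription_request(audio_urls, speech_key, service_region, locale="en-US"):
    """Build (but do not send) a request that creates a batch transcription job."""
    # Assumption: v3.0 endpoint path; newer API versions use a different path.
    endpoint = (
        f"https://{service_region}.api.cognitive.microsoft.com"
        "/speechtotext/v3.0/transcriptions"
    )
    body = {
        "contentUrls": list(audio_urls),  # audio must be reachable by URL
        "locale": locale,
        "displayName": "batch transcription sketch",  # hypothetical job name
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": speech_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job. The service then fetches the audio from the given URLs itself, so nothing has to be downloaded locally first.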

Darren Cohen
