Using Azure Speech Service, I'm trying to transcribe a bunch of WAV files (compressed in the PCMU, aka mu-law, format).
I came up with the following code based on the articles referenced below. The code sometimes works fine with a few files, but I keep getting Segmentation fault
errors while looping over a bigger list of files (~50), and it never breaks on the same file (it could be the 2nd, 15th, or 27th).
Also, when running a subset of files, the transcription results seem the same with or without the decompression part of the code, which makes me wonder whether the decompression method recommended by Microsoft works at all.

import azure.cognitiveservices.speech as speechsdk

def azurespeech_transcribe(audio_filename):
    class BinaryFileReaderCallback(speechsdk.audio.PullAudioInputStreamCallback):
        def __init__(self, filename: str):
            super().__init__()
            self._file_h = open(filename, "rb")

        def read(self, buffer: memoryview) -> int:
            try:
                size = buffer.nbytes
                frames = self._file_h.read(size)
                buffer[:len(frames)] = frames
                return len(frames)
            except Exception as ex:
                print('Exception in `read`: {}'.format(ex))
                raise

        def close(self) -> None:
            try:
                self._file_h.close()
            except Exception as ex:
                print('Exception in `close`: {}'.format(ex))
                raise

    compressed_format = speechsdk.audio.AudioStreamFormat(
        compressed_stream_format=speechsdk.AudioStreamContainerFormat.MULAW
    )
    callback = BinaryFileReaderCallback(filename=audio_filename)
    stream = speechsdk.audio.PullAudioInputStream(
        stream_format=compressed_format,
        pull_stream_callback=callback
    )
    speech_config = speechsdk.SpeechConfig(
        subscription="<my_subscription_key>",
        region="<my_region>",
        speech_recognition_language="en-CA"
    )
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config, audio_config)
    result = speech_recognizer.recognize_once()
    return result.text

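
Before judging whether the decompression step has any effect, it may help to confirm that the input files really are mu-law encoded. The following is a minimal sketch using only the standard library; the helper name `wav_format_tag` is hypothetical and not part of the Speech SDK. It assumes the files are ordinary RIFF/WAVE containers and reads the format tag from the `fmt ` chunk (1 = PCM, 6 = A-law, 7 = mu-law).

```python
import struct


def wav_format_tag(path):
    """Return the wFormatTag stored in a RIFF/WAVE file's 'fmt ' chunk.

    Hypothetical helper for illustration: 1 = PCM, 6 = A-law, 7 = mu-law.
    """
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12:
            raise ValueError("file too short to be a RIFF/WAVE file")
        riff, _size, wave_id = struct.unpack("<4sI4s", header)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                # The format tag is the first 16-bit field of the chunk.
                return struct.unpack("<H", f.read(2))[0]
            # Skip other chunks; chunk data is padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling `wav_format_tag(audio_filename)` on each input file and checking for the value 7 would show whether every file in the batch is actually mu-law.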
Code is running on WSL.
I have already tried:
- Logging a more meaningful error with the faulthandler module
- Increasing the Python stack limit: resource.setrlimit(resource.RLIMIT_STACK, (resource.RLIM_INFINITY, resource.RLIM_INFINITY))
- Adding some sleep timers
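
For reference, the first two attempts can be sketched roughly as follows. This is an approximation and not the exact code that was run: the function name `enable_crash_diagnostics` is hypothetical, and the stack limit is raised only to the current hard limit (rather than forcing RLIM_INFINITY for both values) so the call does not fail when the hard limit is lower. The `resource` module is available on Unix-like systems only, which includes WSL.

```python
import faulthandler
import resource


def enable_crash_diagnostics():
    """Enable faulthandler and raise the soft stack limit to the hard limit.

    Hypothetical helper for illustration; returns the resulting
    (soft, hard) stack limits.
    """
    # On SIGSEGV and similar fatal signals, dump the Python traceback
    # of all threads to stderr before the process dies.
    faulthandler.enable(all_threads=True)

    _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    try:
        # Raising the soft limit up to the hard limit needs no privileges.
        resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    except (ValueError, OSError) as ex:
        print("Could not raise stack limit: {}".format(ex))
    return resource.getrlimit(resource.RLIMIT_STACK)
```

Calling `enable_crash_diagnostics()` once at the start of the script, before the loop over the files, is enough; faulthandler only reports the Python-level stack, so a crash inside native SDK code shows up as the last Python frame that called into it.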
References: