SynapseML SpeechToTextSDK with MP3 not working

Question

I can use the SpeechToTextSDK transformer from SynapseML to convert wav files to text in Databricks. However, with mp3 files I get the error 0x27 (SPXERR_GSTREAMER_INTERNAL_ERROR).

The documentation (https://mmlspark.blob.core.windows.net/docs/0.10.0/pyspark/synapse.ml.cognitive.html#module-synapse.ml.cognitive.SpeechToTextSDK) clearly states that mp3 is supported as well:

"The file type of the sound files, supported types: wav, ogg, mp3"

I have a list of audio files in different formats:

wasbs://test@blobstorage.blob.core.windows.net/file1.mp3
wasbs://test@blobstorage.blob.core.windows.net/file2.wav

I configured the SpeechToTextSDK transformer with the code below:

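from synapse.ml.cognitive import SpeechToTextSDK

# YOUR_API_KEY and REGION are placeholders for the Speech resource key and region
stt = (SpeechToTextSDK()
       .setSubscriptionKey(YOUR_API_KEY)
       .setLocation(REGION)
       .setOutputCol("text")
       .setAudioDataCol("wavbytes")   # column holding the raw audio bytes
       .setFormatCol("format")        # column holding the file type of each row
       .setLanguageCol("lang")        # column holding the language of each row
       .setStreamIntermediateResults(False)
      )

# wav_audio_list is the DataFrame of audio rows (wav and mp3 files)
results = stt.transform(wav_audio_list)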
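Since the transformer reads the file type from the "format" column, each row needs a value that matches its file. A minimal sketch of one way to derive it from the paths listed above, assuming the file extension reliably reflects the audio format (the helper name audio_format is hypothetical, not part of SynapseML):

```python
import os
from urllib.parse import urlparse

# Supported types as quoted from the documentation above
SUPPORTED_FORMATS = {"wav", "ogg", "mp3"}


def audio_format(path: str) -> str:
    """Return the lowercase file extension of an audio path, without the dot."""
    ext = os.path.splitext(urlparse(path).path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {path!r}")
    return ext


paths = [
    "wasbs://test@blobstorage.blob.core.windows.net/file1.mp3",
    "wasbs://test@blobstorage.blob.core.windows.net/file2.wav",
]

# (path, format) pairs that could be used to fill the "format" column
rows = [(p, audio_format(p)) for p in paths]
```

This only prepares the column values; it does not change how the mp3 audio itself is decoded.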

Does anyone have an idea?

Many thanks in advance.

score 0 · Answer 1 · answered Jul 27 '22 at 06:52

Just to share my solution: I installed GStreamer through an init script. The procedure is described here for GStreamer () and for the init script configuration (https://docs.databricks.com/clusters/init-scripts.html).

Once GStreamer was installed and the init script configured, the transformer ran smoothly with multiple languages and multiple formats.
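For illustration, here is a rough sketch of how such an init script could be generated. The package list is an assumption based on the GStreamer packages commonly installed on Debian/Ubuntu images for compressed audio input, and build_init_script is a hypothetical helper; the exact script used for this answer is not shown here.

```python
# Assumed GStreamer packages for an Ubuntu-based cluster image (not confirmed by the answer)
GSTREAMER_PACKAGES = [
    "libgstreamer1.0-0",
    "gstreamer1.0-plugins-base",
    "gstreamer1.0-plugins-good",
    "gstreamer1.0-plugins-bad",
    "gstreamer1.0-plugins-ugly",
]


def build_init_script(packages):
    """Build the text of a bash init script that installs the given apt packages."""
    lines = [
        "#!/bin/bash",
        "set -e",  # stop at the first failing command
        "apt-get update",
        "apt-get install -y " + " ".join(packages),
    ]
    return "\n".join(lines) + "\n"


script = build_init_script(GSTREAMER_PACKAGES)
print(script)

# In a Databricks notebook the text could then be written to storage, for example with
# dbutils.fs.put("<your-init-script-path>", script, True), where the path is a placeholder,
# and referenced in the cluster's init script settings.
```

The cluster has to be restarted after the init script is attached so that the packages are installed on every node.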
