Sending Twilio Stream to Azure speech translation

Question

I'm working on a solution to do real-time speech translation using azure-cognitiveservices-speech.

The Azure solution, based on Azure's boilerplate code, works fine, but only with an audio file or microphone input. When I attempt to pass it a stream from a voice service provider (Twilio), it breaks with the reasoncancelled error.

I used the Twilio Boilerplate code for transcribing a stream with Google Speech and it works fine.

Any direction on this would be a great help.

Some related questions that I explored before deciding to post my own:

72769149

score 0 · Answer 1 · answered Mar 09 '23 at 19:24

This is Darren from the Microsoft Azure Speech SDK team. We have sample code that shows how to do speech translation with an input audio stream. Please have a look at https://github.com/Azure-Samples/cognitive-services-speech-sdk and find the samples for your programming language. Please also read the public documentation, including "how to use an audio input stream": https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-audio-input-streams . The default input format is raw (uncompressed) PCM at a 16 kHz sample rate, mono, 16-bit integer. If your audio is not in this format (a different sample rate, or compressed audio), you can either configure the Speech SDK to accept that format or do the decoding/resampling yourself before feeding the audio into the Speech SDK.

So that we can help you further, please:

Open an issue here: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues
Provide your source code
Provide the error message you get
Provide the format of your input audio (an example WAV file will help)
Provide a Speech SDK log of your run (https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-logging).
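
To illustrate the "do the decoding/resampling yourself" option mentioned above, here is a minimal Python sketch. It assumes the Twilio stream arrives as Media Streams WebSocket messages whose "media" events carry base64-encoded 8 kHz mono mu-law audio, which is how Twilio documents that format; the thread itself does not state it. The function names are hypothetical, and the 2x upsampling is plain linear interpolation applied per chunk, which is crude but enough to show the shape of the conversion.

```python
import base64
import json
import struct


def mulaw_byte_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def upsample_2x(samples: list[int]) -> list[int]:
    """Double the sample rate (8 kHz -> 16 kHz) by linear interpolation.

    Simplification: the last sample of each chunk is repeated instead of
    being interpolated against the first sample of the next chunk.
    """
    out = []
    for i, current in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else current
        out.append(current)
        out.append((current + nxt) // 2)
    return out


def twilio_media_to_pcm16k(message: str) -> bytes:
    """Convert one Twilio Media Streams message to 16 kHz, 16-bit mono PCM.

    Returns b"" for non-media events such as "start" or "stop".
    """
    data = json.loads(message)
    if data.get("event") != "media":
        return b""
    mulaw = base64.b64decode(data["media"]["payload"])
    pcm8k = [mulaw_byte_to_pcm16(b) for b in mulaw]
    pcm16k = upsample_2x(pcm8k)
    # Little-endian signed 16-bit samples, matching the default format above.
    return struct.pack(f"<{len(pcm16k)}h", *pcm16k)
```

The returned bytes could then be written to a push stream, for example with `write()` on a `speechsdk.audio.PushAudioInputStream` created with `AudioStreamFormat(samples_per_second=16000, bits_per_sample=16, channels=1)`. That wiring is not shown or tested here and should be checked against the SDK samples for your language.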

Hello Darren, thank you so much for the help. I noticed that the code sample is only available in C#. Is the sample for the audio stream settings by any chance available in other languages such as Python or Node.js? — Rajesh Rajamani, Mar 10 '23 at 08:28
Look at Speech Recognition samples as well as translation. See if there are samples showing how to handle input streams. The APIs in the different programming languages are similar, so even if you don't find exactly what you need in one programming language, have a look at samples in another language to get an idea. And as mentioned above, if you can't solve the issue, please open a GitHub issue. — Darren Cohen, Mar 13 '23 at 13:53
Hi Darren, after some unsuccessful attempts with the audio formatting, I have opened an issue with my complete source code (without any secrets) and the error log. I will keep track of it here: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/1882 — Rajesh Rajamani, Mar 20 '23 at 11:04
