
I'm working on a solution for real-time speech translation using azure-cognitiveservices-speech.

The Azure solution (based on the boilerplate code from Azure) works fine, but only with an audio file or input from a microphone. When I attempt to pass a stream from a voice service provider (Twilio), it breaks with a reason Canceled error.

I also tried the Twilio boilerplate code for transcribing a stream with Google Speech, and that works fine.

Any direction on this would be a great help.

Some questions that I explored before deciding to post my own:

72769149

Rajesh Rajamani

1 Answer


This is Darren from the Microsoft Azure Speech SDK team. We have sample code that shows how to do speech translation with an input audio stream. Please have a look at https://github.com/Azure-Samples/cognitive-services-speech-sdk and find the samples for your programming language. Please also read the public documentation, including "how to use an audio input stream": https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-use-audio-input-streams . The default input format is raw (uncompressed) PCM: 16 kHz sample rate, mono, 16-bit integer samples. If your audio is in a different format (a different sample rate, or compressed audio), you can either configure the Speech SDK to accept that format or do the decoding/resampling yourself before feeding the audio into the Speech SDK.
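To illustrate the "decode it yourself" option, here is a minimal Python sketch, not taken from the official samples. It assumes the Twilio stream delivers JSON messages whose `media.payload` field holds base64-encoded 8 kHz mono mu-law audio; that assumption should be checked against the stream's own `start` message. The helper names (`mulaw_to_pcm16`, `pcm_from_twilio_message`, `make_push_stream`) are hypothetical, and the 8 kHz push-stream format is likewise an assumption about what the service will accept for your scenario.

```python
import base64
import json
import struct


def _decode_mulaw_byte(b: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~b & 0xFF                      # mu-law bytes are stored bit-inverted
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if u & 0x80 else sample


# Precomputed lookup table: mu-law byte -> 2 bytes of little-endian PCM.
_MULAW_TABLE = [struct.pack("<h", _decode_mulaw_byte(b)) for b in range(256)]


def mulaw_to_pcm16(data: bytes) -> bytes:
    """Convert mu-law bytes to raw 16-bit little-endian PCM."""
    return b"".join(_MULAW_TABLE[b] for b in data)


def pcm_from_twilio_message(message: str):
    """Return PCM bytes for a 'media' message, or None for other events.

    Assumes the message layout {"event": "media", "media": {"payload": ...}}.
    """
    msg = json.loads(message)
    if msg.get("event") != "media":
        return None
    return mulaw_to_pcm16(base64.b64decode(msg["media"]["payload"]))


def make_push_stream(sample_rate: int = 8000):
    """Create a push stream that declares 16-bit mono PCM at `sample_rate`.

    The SDK is imported lazily so the decoding helpers above can be used
    (and tested) without azure-cognitiveservices-speech installed.
    """
    import azure.cognitiveservices.speech as speechsdk

    fmt = speechsdk.audio.AudioStreamFormat(
        samples_per_second=sample_rate, bits_per_sample=16, channels=1
    )
    stream = speechsdk.audio.PushAudioInputStream(stream_format=fmt)
    audio_config = speechsdk.audio.AudioConfig(stream=stream)
    return stream, audio_config
```

In a WebSocket handler you would call `stream.write(pcm)` for every message where `pcm_from_twilio_message` returns bytes, call `stream.close()` when the call ends, and pass `audio_config` to the translation recognizer. If recognition is still canceled, the cancellation details on the result should say whether the audio format is the cause.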

In order to help you further, please

Darren Cohen
  • Hello Darren, thank you so much for the help. I noticed that the code sample is only available in C#. Is the sample for the audio stream settings by any chance available in other languages, such as Python or Node.js? – Rajesh Rajamani Mar 10 '23 at 08:28
  • Look at the speech recognition samples as well as the translation ones, and see if any of them show how to handle input streams. The APIs in the different programming languages are similar, so even if you don't find exactly what you need in one language, have a look at samples in another language to get an idea. And as mentioned above, if you can't solve the issue, please open a GitHub issue. – Darren Cohen Mar 13 '23 at 13:53
  • Hi Darren, after some unsuccessful attempts with the audio formatting, I have opened an issue that includes my complete source code (without any secrets) and the error log. Here is the issue, which I will keep track of: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/1882 – Rajesh Rajamani Mar 20 '23 at 11:04