I’m currently working on a live translation web app that lets multiple participants use Azure Speech Translation and share their transcriptions in multiple languages.

I don’t want to be billed for the number of participants × the duration of the meeting. Hence the question: how can I activate recognition only when speech is detected? This way, I would only pay for the people who are currently speaking.

I tried to use the speechStartDetected event from the TranslationRecognizer class, but this event seems to fire only while the recognizer is already recognizing (with recognizeOnceAsync() or startContinuousRecognitionAsync()).

Is there any parameter within the Speech SDK I can use to achieve what I want? If not, what are my options?

It might be possible to watch the audio dB level and activate continuous recognition accordingly, but I think I will run into problems if I do it this way. For example, once the audio level stays above a certain threshold for a certain duration, this would trigger startContinuousRecognitionAsync(), but the recognizer would miss the beginning of the speech…
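To make the level-gating idea concrete, here is a minimal sketch of what I have in mind, with a small "pre-roll" buffer so that the audio captured just before the trigger is not lost. Everything in it is an assumption for illustration: PreRollGate, frameLevelDb, the callback names, and the threshold values are hypothetical and not part of the Speech SDK. In the real app, onAudio would presumably write each frame into a push stream created with AudioInputStream.createPushStream(), and onOpen/onClose would start and stop continuous recognition; I have not verified that this behaves well in practice.

```javascript
// Hypothetical helper code: names and default values are illustrative assumptions.

// RMS level of a 16-bit PCM frame in dBFS (0 dBFS = full scale).
function frameLevelDb(frame) {
  if (frame.length === 0) return -Infinity;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const sample = frame[i] / 32768;
    sumSquares += sample * sample;
  }
  const rms = Math.sqrt(sumSquares / frame.length);
  return rms > 0 ? 20 * Math.log10(rms) : -Infinity;
}

// Keeps the most recent frames while closed. Opens after `openAfterFrames`
// consecutive loud frames and replays the buffered frames first, so the
// start of the utterance is forwarded too. Closes after `closeAfterFrames`
// consecutive quiet frames.
class PreRollGate {
  constructor({
    thresholdDb = -40,
    preRollFrames = 10, // should be >= openAfterFrames
    openAfterFrames = 3,
    closeAfterFrames = 50,
    onOpen = () => {},  // e.g. start continuous recognition
    onAudio = () => {}, // e.g. write the frame to a push stream
    onClose = () => {}, // e.g. stop continuous recognition
  } = {}) {
    Object.assign(this, {
      thresholdDb, preRollFrames, openAfterFrames, closeAfterFrames,
      onOpen, onAudio, onClose,
    });
    this.isOpen = false;
    this.buffer = [];
    this.loudRun = 0;
    this.quietRun = 0;
  }

  push(frame) {
    const loud = frameLevelDb(frame) >= this.thresholdDb;

    if (!this.isOpen) {
      // Closed: remember only the last `preRollFrames` frames.
      this.buffer.push(frame);
      if (this.buffer.length > this.preRollFrames) this.buffer.shift();

      this.loudRun = loud ? this.loudRun + 1 : 0;
      if (this.loudRun >= this.openAfterFrames) {
        this.isOpen = true;
        this.quietRun = 0;
        this.onOpen();
        for (const buffered of this.buffer) this.onAudio(buffered);
        this.buffer = [];
      }
      return;
    }

    // Open: forward audio and watch for sustained silence.
    this.onAudio(frame);
    this.quietRun = loud ? 0 : this.quietRun + 1;
    if (this.quietRun >= this.closeAfterFrames) {
      this.isOpen = false;
      this.loudRun = 0;
      this.onClose();
    }
  }
}
```

Each microphone frame (an Int16Array of PCM samples) would be passed to gate.push(frame). The pre-roll length trades a little extra forwarded audio against the risk of clipping the first word, and the close delay keeps short pauses inside a sentence from stopping and restarting recognition.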

Thanks in advance!

Simon

1 Answer

For a real-time speech-to-text solution for live calls: with Speech to Text, you pay as you go based on the number of hours of audio you transcribe. To learn how to view your billing invoice and usage data, see https://learn.microsoft.com/en-us/azure/billing/billing-download-azure-invoice-daily-usage-date

Ram