I am very new to the Text-to-Speech (TTS) Cognitive Service in Microsoft Azure. I was able to convert a given text into an audio file using the Azure TTS service. It works fine when I have a single voice element in my SSML document. An example of working SSML:

<speak version="1.0" xml:lang="en-US">
  <voice xml:lang="en-US" xml:gender="Male" name="en-US-Jessa24kRUS"> 
       Hello, this is my sample text to convert into audio? 
  </voice>
</speak>

But when I have multiple voice tags (alternating by gender), it causes an error. The SSML is:

<speak version="1.0" xml:lang="en-US">
  <voice xml:lang="en-US" xml:gender="Male" name="en-US-Guy24kRUS"> What’s your name? </voice>
  <voice xml:lang="en-US" xml:gender="Female" name="en-US-Jessa24kRUS"> My name is Cindy Smith. Do you know John Silver?</voice>
  <voice xml:lang="en-US" xml:gender="Male" name="en-US-Guy24kRUS"> John and I are old friends. </voice>
  <voice xml:lang="en-US" xml:gender="Female" name="en-US-Jessa24kRUS"> John just joined our company as a salesperson. </voice>
  <voice xml:lang="en-US" xml:gender="Male" name="en-US-Guy24kRUS"> That’s good news. John has been a salesperson for chemical products for many years. </voice>
  <voice xml:lang="en-US" xml:gender="Female" name="en-US-Jessa24kRUS"> I heard he really likes his new job.</voice>
</speak>

And the error is:

Response status code does not indicate success: 400 (SSML must contain a maximum of 5 voice elements. Actual 6.).

It would be a great help if someone could explain why it limits me to five voice tags when no such limitation is mentioned in the documentation.

Arsman Ahmad

1 Answer

This is a known setting, in place because of latency. We are aware of it and are working on removing the limitation. We hope to complete the fix and deployment this week; if things go smoothly, we may finish earlier.
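
Until the limit is lifted, one possible workaround is to split the SSML into several documents with at most five voice elements each and send one request per document. The following is only a sketch under some assumptions: `split_ssml` is a hypothetical helper (not part of any Azure SDK), the SSML has no default `xmlns` on `speak`, and the `voice` elements are direct children of `speak`. Sending each chunk and joining the returned audio is left to the caller.

```python
import xml.etree.ElementTree as ET
from typing import List


def split_ssml(ssml: str, max_voices: int = 5) -> List[str]:
    """Split one SSML document into several, each with at most max_voices <voice> elements.

    Hypothetical helper: assumes <voice> elements are direct children of <speak>
    and that the document declares no default XML namespace.
    """
    root = ET.fromstring(ssml)
    voices = root.findall("voice")
    chunks = []
    for start in range(0, len(voices), max_voices):
        # Recreate the <speak> wrapper with the same attributes (version, xml:lang).
        speak = ET.Element(root.tag, dict(root.attrib))
        speak.extend(voices[start:start + max_voices])
        chunks.append(ET.tostring(speak, encoding="unicode"))
    return chunks


# Usage: a document with six voice elements yields two documents (5 + 1).
sample = (
    '<speak version="1.0" xml:lang="en-US">'
    + "".join(
        '<voice xml:lang="en-US" name="en-US-Guy24kRUS">Line %d.</voice>' % i
        for i in range(6)
    )
    + "</speak>"
)
parts = split_ssml(sample)
print([p.count("<voice") for p in parts])  # prints [5, 1]
```

Each returned string is a complete `speak` document, so it can be posted to the same endpoint as before.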

Ram