1

When using the prosody tag in SSML with Google Cloud TTS, I cannot adjust the attributes of individual words without creating an unwanted pause.

The code below creates a lag between 'New' and 'Video'. It has been suggested that a postprocessor can remove these pauses, but I'd like to know whether there's a way of avoiding them directly within the SSML itself.

<speak>

Hello, and welcome to this<prosody pitch="+3st">New</prosody>Video Tutorial.

</speak>
Incertus

2 Answers

0

After testing, it appears there isn't a way of doing this using Google Cloud TTS. You can manually edit the sound file after generating it, but that defeats the object of the exercise.

Incertus
0

I don't have the cleanest answer, as what you are asking for is not well supported. Prosody's pitch contour lets you change the tone of voice at different points in the sentence, so you can raise the pitch around a single word without wrapping that word in its own tag.

Example of Prosody contour

<speak><prosody contour="(0%, +20Hz) (20%, +30%) (100%, +20%)"> Hello friends! </prosody></speak>

I am still playing around with this, but it seems like a tedious way of getting what you want done.

Using contour

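contour takes a string of space-separated tuples: "(position in the sentence as a %, pitch adjustment) (..., ...)". The pitch adjustment can be given in Hz or as a percentage, as in the example above.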
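Since writing these tuples by hand gets tedious, here is a rough Python sketch that builds the contour string from a list of points. The helper names (build_contour, contour_ssml) are my own, not part of any Google library, and the positions and Hz values are guesses for where 'New' falls in the sentence; you would need to tune them by ear. It only builds the SSML text and does not call the TTS API.

```python
from xml.sax.saxutils import escape, quoteattr


def build_contour(points):
    """Format (position_percent, pitch_change) pairs as a contour string.

    Hypothetical helper: each pair becomes "(position%,change)" and the
    pairs are joined with spaces.
    """
    return " ".join(f"({pos}%,{change})" for pos, change in points)


def contour_ssml(text, points):
    """Wrap the whole sentence in one prosody tag carrying the contour."""
    contour = build_contour(points)
    return (
        f"<speak><prosody contour={quoteattr(contour)}>"
        f"{escape(text)}</prosody></speak>"
    )


# Assumed values: flat pitch, a bump roughly where 'New' is spoken,
# then back to flat. Adjust the percentages after listening.
points = [(0, "+0Hz"), (55, "+0Hz"), (65, "+30Hz"), (75, "+0Hz")]
ssml = contour_ssml("Hello, and welcome to this New Video Tutorial.", points)
print(ssml)
```

The idea is that the sentence stays inside a single prosody element, so there is no tag boundary between 'this', 'New' and 'Video'. Whether that actually avoids the pause is something I have not confirmed.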

I hope this helped and best of luck on your work!

Cleve Green
  • 729
  • 5
  • 12