I went through the SSML documentation for Google Text-to-Speech: https://developers.google.com/assistant/actions/reference/ssml#prosody
There is a tag called <prosody>
which, according to the W3C specification, accepts an attribute called duration: a value in seconds or milliseconds for the desired time to take to read the contained text.
So <speak><prosody duration='6s'>Hello, How are you?</prosody></speak>
should take 6 seconds for Google Text-to-Speech to speak. But when I try it at https://cloud.google.com/text-to-speech/ it does not work, and I also tried it through the REST API with the same result.
Does Google Text-to-Speech ignore the duration attribute? If so, is there another way to achieve the same result?
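
For reference, here is a minimal sketch of how the SSML above could be packaged into a request body for the REST API, assuming the text:synthesize endpoint of the Cloud Text-to-Speech v1 API. The helper name build_request_body and the voice and audio settings are illustrative assumptions, not necessarily the exact request that was sent. The snippet only builds and prints the JSON; it does not call the service.

```python
import json

# Assumed endpoint: the synthesize method of the Cloud Text-to-Speech v1 REST API.
ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request_body(ssml: str) -> dict:
    """Wrap an SSML string in the JSON body shape used by text:synthesize.

    The voice and audioConfig values are placeholders chosen for
    illustration (hypothetical settings, not taken from the question).
    """
    return {
        "input": {"ssml": ssml},
        "voice": {"languageCode": "en-US"},       # assumed language
        "audioConfig": {"audioEncoding": "MP3"},  # assumed encoding
    }


# The SSML from the question, with the duration attribute on <prosody>.
ssml = "<speak><prosody duration='6s'>Hello, How are you?</prosody></speak>"
body = build_request_body(ssml)

# Print the JSON that would be POSTed to ENDPOINT (with credentials).
print(json.dumps(body, indent=2))
```

Posting this body to the endpoint with valid credentials is the step where the duration attribute appears to have no effect.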