When the webhook is called multiple times in one scene and each call sends a simple response, the simple responses are merged incorrectly.
Prompt from the first webhook call:
{
  "override": false,
  "firstSimple": {
    "speech": "<speak><audio src=\"https://www.example.com/audio/file1.mp3\"></speak>",
    "text": "Text 1"
  }
}
Prompt from the second webhook call:
{
  "override": false,
  "firstSimple": {
    "speech": "<speak><audio src=\"https://www.example.com/audio/file2.mp3\"></audio> <audio src=\"https://www.example.com/audio/file3.mp3\"></audio></speak>",
    "text": " Text 2"
  }
}
Merged prompt in the response sent to the user:
{
  "firstSimple": {
    "speech": "<speak><speak><audio src=\"https://www.example.com/audio/file1.mp3\"></speak> <audio src=\"https://www.example.com/audio/file2.mp3\"/> <audio src=\"https://www.example.com/audio/file3.mp3\"/></speak>",
    "text": "Text 1 Text2"
  }
}
Because the merged speech contains two nested <speak> tags, the SSML is invalid and is not spoken.
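For illustration, a merge that produces valid SSML would strip each prompt's outer <speak> wrapper before joining the parts and then add a single wrapper around the result. Below is a minimal Python sketch of that idea. The names strip_speak and merge_speech are hypothetical helpers, not part of the Actions on Google platform or any library, and the sketch assumes each speech string has at most one outer <speak> wrapper.

```python
import re

# Hypothetical helpers for illustration only; not an Actions on Google API.
# Matches a string that consists of exactly one outer <speak>...</speak> wrapper.
_SPEAK_WRAPPER = re.compile(r"^\s*<speak>(.*)</speak>\s*$", re.DOTALL)


def strip_speak(ssml: str) -> str:
    """Return the content inside the outer <speak> wrapper, or the input unchanged."""
    match = _SPEAK_WRAPPER.match(ssml)
    return match.group(1) if match else ssml


def merge_speech(*parts: str) -> str:
    """Join speech strings with a space and wrap them in a single <speak> root."""
    inner = " ".join(strip_speak(part).strip() for part in parts)
    return f"<speak>{inner}</speak>"
```

Applied to the speech values of the two prompts above, merge_speech would return a string with exactly one <speak> root element instead of the nested pair.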
Sometimes the speech object is completely missing.
I have already created a GitHub issue for this.