
While experimenting with webhook responses in Dialogflow, I return a response that mixes audio and speech. The Actions Test Console reads it out literally (that is, all the XML tags are read aloud, etc.), but when I click the Audio tab in the same test console to find out what's wrong in the XML, the console plays the sound and words correctly, as if nothing were wrong.

What could cause this?

Addendum: This is the response I produce in JavaScript:

    conv.ask(`<speak>Här kommer ljudet.</speak>` +
    `<speak><par><media xml:id="environment" end="effect.end" fadeOutDur="3.0s"><audio src="${ljud3}" /></media>` +
    `<media xml:id="effect"><audio src="${ljud1}" begin="2.0s" /> </media></par></speak>`);

In the Audio tab of the Actions Console it looks like this, and it works as expected when I press "Update and listen":

    <speak>Här kommer ljudet.</speak><speak><par><media xml:id="environment" end="effect.end" fadeOutDur="3.0s"><audio src="https://www.sigvardson.se/public/running_on_gravel.ogg" /></media><media xml:id="effect"><audio src="https://actions.google.com/sounds/v1/cartoon/clang_and_wobble.ogg" begin="2.0s" /> </media></par></speak>

The Response tab in the console looks like this:

    {
      "payload": {
        "google": {
          "expectUserResponse": true,
          "richResponse": {
            "items": [
              {
                "simpleResponse": {
                  "textToSpeech": "<speak>Här kommer ljudet.</speak><speak><par><media xml:id=\"environment\" end=\"effect.end\" fadeOutDur=\"3.0s\"><audio src=\"https://www.sigvardson.se/public/running_on_gravel.ogg\" /></media><media xml:id=\"effect\"><audio src=\"https://actions.google.com/sounds/v1/cartoon/clang_and_wobble.ogg\" begin=\"2.0s\" /> </media></par></speak>"
                }
              },
              {
                "simpleResponse": {
                  "textToSpeech": "<speak>Vill du höra <break time=\"500ms\"/> mer?</speak>"
                }
              }
            ],
            "suggestions": [
              {
                "title": "ja"
              },
              {
                "title": "nej"
              }
            ]
          }
        }
      }
    }

Oortone
  • There are many things that *could* cause this. It would probably be best if you updated your question to show how you're generating the SSML, the contents of the "response" tab, and anything else that could help point you in the right direction. – Prisoner Feb 22 '20 at 16:53
  • And you mention that you're using "the Actions Test Console". Is this for Actions on Google? If so, you may wish to add the tag "actions-on-google" as well. – Prisoner Feb 22 '20 at 16:54
  • I added more information. I find it rather strange that it only comes out wrong the first time, but not when it is triggered again, unaltered, from the Audio tab. – Oortone Feb 22 '20 at 17:41

1 Answer


The issue is that you have two <speak> tags in your response. If you change this to just have one <speak> tag around the entire thing, it should work better.
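A minimal sketch of that change, reusing the sounds from the question. `buildSsml` is a hypothetical helper name used only for illustration, not part of any library; the attributes are copied from the question as-is, and the only difference is that the intro text and the `<par>` block share one `<speak>` root:

```javascript
// Hypothetical helper (not a library function): wraps the spoken intro and
// the parallel audio block in a single <speak> element instead of two.
function buildSsml(intro, environmentUrl, effectUrl) {
  return (
    `<speak>${intro}` +
    `<par>` +
    `<media xml:id="environment" end="effect.end" fadeOutDur="3.0s">` +
    `<audio src="${environmentUrl}" /></media>` +
    `<media xml:id="effect"><audio src="${effectUrl}" begin="2.0s" /></media>` +
    `</par></speak>`
  );
}

// Sound URLs taken from the question (ljud3 and ljud1 there).
const ljud3 = 'https://www.sigvardson.se/public/running_on_gravel.ogg';
const ljud1 = 'https://actions.google.com/sounds/v1/cartoon/clang_and_wobble.ogg';

const ssml = buildSsml('Här kommer ljudet.', ljud3, ljud1);
console.log(ssml);
```

In the webhook you would then pass the result to `conv.ask(ssml)` in place of the concatenated two-`<speak>` string.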

Prisoner
  • Yes, that was the problem. I noticed I can still use `<speak>` inside media tags if I need to mix audio and speech. Thanks. – Oortone Feb 23 '20 at 00:00