This is my first post on StackOverflow (long-time lurker, first-time poster), so go easy on me. ^__^;
For those having trouble implementing play/pause/resume functionality with a STATIC mp3, I’m assuming the process is the same, so hopefully this post will help you as well.
I’m working on building a live mp3 streaming Google Action, and I seem to be having issues with implementing it in the new Actions Console https://console.actions.google.com/
According to the Google Actions documentation found here: https://developers.google.com/assistant/conversational/prompts-media (last updated 2021-03-10 UTC), I should be able to invoke a Media Response to play an mp3 back to the user using the YAML / JSON example provided there. However, playing, pausing, and resuming doesn’t seem to work correctly with a streaming mp3 URL.
TLDR; Here's a shorter version of the write up: https://i.stack.imgur.com/tua7P.jpg
For a more detailed analysis see below:
STEPS TO REPRODUCE
Starting with the example provided in the documentation, I popped the JSON version of the sample code (posted here for convenience) into the On Enter section of the scene, and I was able to play the media fine.
{
  "candidates": [
    {
      "first_simple": {
        "variants": [
          {
            "speech": "This is a media response."
          }
        ]
      },
      "content": {
        "media": {
          "optional_media_controls": [
            "PAUSED",
            "STOPPED"
          ],
          "media_objects": [
            {
              "name": "Media name",
              "description": "Media description",
              "url": "https://storage.googleapis.com/automotive-media/Jazz_In_Paris.mp3",
              "image": {
                "large": {
                  "url": "https://storage.googleapis.com/automotive-media/album_art.jpg",
                  "alt": "Jazz in Paris album art"
                }
              }
            }
          ],
          "media_type": "AUDIO"
        }
      }
    }
  ]
}
Note: In the above JSON I removed the start_offset node because it’s currently not supported by iOS and is probably put in there as an example for testing purposes.
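For anyone who would rather build this prompt in code than paste it into the console, here is a minimal Python sketch that assembles the same structure as the JSON above. The function name build_media_prompt and its parameters are my own invention for illustration (they are not part of any Google library); the sketch only produces a dict shaped like the sample, with start_offset left out.

```python
import json


def build_media_prompt(url, name, description, image_url, image_alt, speech):
    """Return a dict shaped like the documentation's media response sample.

    Hypothetical helper: the name and parameters are illustrative only.
    start_offset is intentionally omitted, as in the JSON above.
    """
    return {
        "candidates": [
            {
                "first_simple": {"variants": [{"speech": speech}]},
                "content": {
                    "media": {
                        "optional_media_controls": ["PAUSED", "STOPPED"],
                        "media_objects": [
                            {
                                "name": name,
                                "description": description,
                                "url": url,
                                "image": {
                                    "large": {"url": image_url, "alt": image_alt}
                                },
                            }
                        ],
                        "media_type": "AUDIO",
                    }
                },
            }
        ]
    }


prompt = build_media_prompt(
    url="https://storage.googleapis.com/automotive-media/Jazz_In_Paris.mp3",
    name="Media name",
    description="Media description",
    image_url="https://storage.googleapis.com/automotive-media/album_art.jpg",
    image_alt="Jazz in Paris album art",
    speech="This is a media response.",
)

# Pretty-print so the result can be compared against the sample by eye.
print(json.dumps(prompt, indent=2))
```

Swapping in a different url argument is all it takes to test another audio source with the same prompt.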
Here’s an example of the static mp3 media response playing for reference: https://downloaddave.com/reviews/clients/momentum-br/ga-sr/Screenshot_streaming_playing_no_error_with_test_mp3.png
I noticed that pausing and resuming the static mp3 does not work unless you enable the following System Intents:
MEDIA_STATUS_PAUSED
MEDIA_STATUS_STOPPED
MEDIA_STATUS_FAILED
MEDIA_STATUS_FINISHED
Otherwise, if you click on the “pause” icon on the Media Response Player or invoke the pause earcon (earcon = ear + icon) you will encounter the following errors:
Sorry, [Your Action’s Display Name] isn't responding right now. Please try again soon.
Did not find any handling for intent event 'actions.intent.MEDIA_STATUS_PAUSED' on scene 'playStreamingAudio'
{
  "endConversation": {}
}
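To make the failure mode concrete, here is a small Python sketch of the check that seems to be happening: a scene that plays media needs a handler for each of the four media status intents listed above, and any that are missing produce the "Did not find any handling" error. The function name missing_media_handlers is hypothetical, and this is only a local sanity check on a list of intent names, not a call into the Actions Console.

```python
# The four system intents listed above, in the short form used in this post.
REQUIRED_MEDIA_INTENTS = (
    "MEDIA_STATUS_PAUSED",
    "MEDIA_STATUS_STOPPED",
    "MEDIA_STATUS_FAILED",
    "MEDIA_STATUS_FINISHED",
)


def missing_media_handlers(handled_intents):
    """Return the required media intents that the scene does not handle.

    Hypothetical helper: handled_intents is whatever list of intent names
    you have configured on the scene, typed in by hand.
    """
    handled = set(handled_intents)
    return [name for name in REQUIRED_MEDIA_INTENTS if name not in handled]


# A scene that only handles FINISHED is missing the other three.
print(missing_media_handlers(["MEDIA_STATUS_FINISHED"]))
```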
Under the Error and status handling section of the scene, I added the system intents as seen in the following screenshot.
Note that if I just transition the MEDIA_STATUS_PAUSED to “No Transition” it gives me an error message, Event handler for ‘playStreamingAudio’ has an empty function call and/or empty transition.
If it goes to “End Conversation” it ends the test and exits out of the Media Response Card rather than giving me the option to resume (which seems like a bad user/conversational flow and probably won't pass review).
Tapping the “pause” icon, typing, or saying “pause” doesn’t work unless the MEDIA_STATUS_PAUSED transitions to another Scene, which I’ve called pauseStreamingAudio.
In the pauseStreamingAudio scene, I added a prompt letting the user know they can say “play” or “cancel”, along with suggestions indicating the same.
{
  "candidates": [
    {
      "first_simple": {
        "variants": [
          {
            "speech": "You can say play to resume audio or cancel to quit."
          }
        ]
      },
      "suggestions": [
        {
          "title": "Play"
        },
        {
          "title": "Cancel"
        }
      ]
    }
  ]
}
From the pauseStreamingAudio Scene, I added a custom intent “play” to go back to the previous Scene, which I’ve called playStreamingAudio.
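The two-scene setup described above boils down to a tiny state machine. Here is a rough Python sketch of it, using the scene and intent names from this post; the TRANSITIONS table and next_scene function are just my illustration of the flow and have nothing to do with how the Actions Console stores scenes internally.

```python
# (current scene, event) -> next scene, exactly as wired up in this post.
TRANSITIONS = {
    ("playStreamingAudio", "MEDIA_STATUS_PAUSED"): "pauseStreamingAudio",
    ("pauseStreamingAudio", "play"): "playStreamingAudio",
}


def next_scene(scene, event):
    """Return the scene to transition to, or None if nothing handles the event."""
    return TRANSITIONS.get((scene, event))


# Walk through pause followed by resume.
scene = "playStreamingAudio"
for event in ("MEDIA_STATUS_PAUSED", "play"):
    scene = next_scene(scene, event)
    print(event, "->", scene)
```

A None result corresponds to the "Did not find any handling for intent event" situation, which is why every event the player can raise needs an entry.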
I’m not sure if I’m doing this right BUT IT WORKS!
Streaming mp3
Now that I got the foundation working, I swapped out the static mp3 for the streaming audio. Here is the sample JSON with the streaming mp3 link swapped in and “start_offset” removed.
{
  "candidates": [
    {
      "first_simple": {
        "variants": [
          {
            "speech": "This is a media response."
          }
        ]
      },
      "content": {
        "media": {
          "optional_media_controls": [
            "PAUSED",
            "STOPPED"
          ],
          "media_objects": [
            {
              "name": "Media name",
              "description": "Media description",
              "url": "https://prod-35-230-37-193.wostreaming.net/momentum-kvmifmaac-ibc2",
              "image": {
                "large": {
                  "url": "https://storage.googleapis.com/automotive-media/album_art.jpg",
                  "alt": "Jazz in Paris album art"
                }
              }
            }
          ],
          "media_type": "AUDIO"
        }
      }
    }
  ]
}
The URL of the stream that I’m testing with doesn't end in *.mp3, and when I check the Content-Type it reads as audio/aacp.
Codec: ADTS
Type: Audio
Channels: Stereo
Sample Rate: 44100 Hz
Bits per Sample: 32
AAC Extension: SBR+PS
This works, and I'm able to stream audio from the source file. See screenshot below.
However, there is a display error on the Media Response Player: the time index at the bottom right reads Infinity:NaN:NaN (highlighted in the red box).
Likely related: I can no longer trigger the Pause System Intent. Instead, I get the following error:
https://downloaddave.com/reviews/clients/momentum-br/ga-sr/Screenshot_streaming_pause_error.png
Notice that the drop-down is open and there is no response for me to use and troubleshoot.
I also tried looking through the Actions on Google documentation to see if there could be something wrong with the audio stream I was providing. The best thing I could find was:
“Audio for playback must be in a correctly formatted MP3 file. MP3 files must be hosted on a web server and be publicly available through an HTTPS URL. Live streaming is only supported for the MP3 format.”
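Given that requirement, one quick troubleshooting step is to check what Content-Type the stream actually reports. Below is a minimal Python sketch using only the standard library. The helper names are my own, and treating audio/mpeg as "MP3" is an assumption on my part (it is the MIME type commonly served for MP3; audio/mp3 is a non-standard alias some servers use). I don't know exactly what the Media Response player checks, so this is only a rough first test.

```python
import urllib.request

# Assumption: MIME types commonly served for MP3 audio.
# audio/mpeg is the registered type; audio/mp3 is a non-standard alias.
MP3_CONTENT_TYPES = {"audio/mpeg", "audio/mp3"}


def normalize_content_type(header_value):
    """Drop any parameters (e.g. '; charset=...') and lowercase the type."""
    return header_value.split(";", 1)[0].strip().lower()


def looks_like_mp3(header_value):
    """Return True if the Content-Type header value suggests MP3 audio."""
    return normalize_content_type(header_value) in MP3_CONTENT_TYPES


def fetch_content_type(url, timeout=10):
    """Open the URL and return its Content-Type header (makes a network call).

    Only the response headers are read; the connection is closed right after.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


# The type reported by my stream versus the type usually served for MP3.
print(looks_like_mp3("audio/aacp"))   # False
print(looks_like_mp3("audio/mpeg"))   # True
```

To test a real URL, pass the result of fetch_content_type(url) to looks_like_mp3. By this check my stream (audio/aacp) is AAC rather than MP3, which may be why the player misbehaves.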
I found some info on mp3 specs on the SSML page here, but I’m not sure if this applies to the Media Response https://developers.google.com/assistant/conversational/ssml#audio - Last updated 2021-05-25 UTC.
Does anyone have any ideas on how I can get this working or even troubleshoot this?
Could some of these circumstances be an issue with the Media Player itself? How would one go about fixing this?
Anyway, I hope this helps somebody out there & thanks very much in advance. Any help is most appreciated.