
I would like to write custom voice commands that can be used on any of the Google listening devices on my home network (Home/Minis), and have each command return a specific response spoken in the same Google voice as all the default commands/queries.

Is there a way to do this?

The end goal is for the commands to hit a URL (on my local network) that returns a message for the Google Home/Minis to speak as the reply.

EDIT: I found a solution.
Google may change how this works, so keep the date of this post in mind.
I used the Google Actions Console (https://console.actions.google.com/u/0/). You can find a few tutorials on this that aren't too bad. Basically, I created a default project (don't use any of the presets), created a scene, and created a custom intent for each action I wanted, with a bunch of training phrases listed under each intent for the best chance of a correct voice capture.
Then, under the scene, I added each intent to the "user intent handling" section.
For each of these, I set the "intent" dropdown to one of my custom intents, and in the "when intent is matched" section I selected "call your webhook" and gave it a relevant handler name. I set mine to the same name as the intent for consistency, then repeated this for all my custom intents.
Now I went to the "Webhook" tab on the left and made sure my fulfillment method was set to HTTPS endpoint.
This is the part that took me a while. There is very little documentation on how to do this, and some of it is actually wrong. In my case, I made a .NET Core 5.0 API since I am familiar with it; I'm sure you could use anything that exposes an HTTPS endpoint. There are two things I learned here.

  1. Some docs say that the handler name you give the trigger will be appended to the endpoint URL. It DOES NOT.
  2. You can only actually have one endpoint. So if you want to do many things, you will have to inspect the request object in your API and route it to whatever it's meant to do (see the controller sketch a few lines below).
    The simplest object I found (that actually worked) for returning text for Google to speak out of your Home/Mini is here: https://developers.google.com/assistant/conversational/prompts-simple#json_1
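For reference, the simple prompt response at that link looks roughly like this (paraphrased from the docs; the spoken text is just an example):

```json
{
  "prompt": {
    "override": false,
    "firstSimple": {
      "speech": "Hello world.",
      "text": "Hello world."
    }
  }
}
```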

Whatever you have your endpoint set to, just put it into the HTTPS endpoint box and hit save.
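To make the single-endpoint routing concrete, here is a minimal sketch of what such a controller could look like in ASP.NET Core 5. This is not my exact code: the route, class names, and the "roll_dice" handler name are made up for illustration, and only the request fields needed for routing are modeled.

```csharp
using System;
using Microsoft.AspNetCore.Mvc;

namespace AssistantWebhook.Controllers
{
    // Request models: only the fields needed to route by handler name.
    // ASP.NET Core's default JSON binding is case-insensitive, so the
    // lowercase "handler"/"name" in Google's payload bind to these.
    public class WebhookHandler { public string Name { get; set; } }
    public class WebhookRequest { public WebhookHandler Handler { get; set; } }

    [ApiController]
    [Route("webhook")] // register https://<your-host>/webhook in the Actions Console
    public class WebhookController : ControllerBase
    {
        private static readonly Random Rng = new Random();

        [HttpPost]
        public IActionResult Post([FromBody] WebhookRequest request)
        {
            // Every intent arrives at this single endpoint, so branch on the
            // handler name you assigned under "call your webhook" in the scene.
            string speech = request?.Handler?.Name switch
            {
                "roll_dice" => $"You rolled a {Rng.Next(1, 7)}.",
                _ => "Sorry, I don't know that one."
            };

            // The anonymous object serializes to the simple-prompt shape
            // linked above: { "prompt": { "firstSimple": { ... } } }.
            return Ok(new
            {
                prompt = new
                {
                    firstSimple = new { speech, text = speech }
                }
            });
        }
    }
}
```

Branching on the handler name is the "filter and route" step from point 2 above; each new intent just becomes another case in the switch.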
I also learned that for .NET this doesn't play nice with localhost. I'm not an expert on networking and HTTPS, but after some googling, an easy fix I found for testing on localhost is to use ngrok. Run it with a command like "ngrok http https://localhost:<port> -host-header=localhost:<port>" (substituting the port your API listens on) and it should spit out two randomly generated .ngrok.io addresses, one HTTP and one HTTPS. You can use the HTTPS one as the root of the API endpoint you give to Google, and it should work!
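For example, assuming your API listens on ASP.NET Core's default HTTPS development port of 5001 (check your own launchSettings.json), the command would be:

```
ngrok http https://localhost:5001 -host-header=localhost:5001
```

(That is the ngrok v2 flag syntax that was current when this was written.)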
You won't need to use the "deploy" option in Google Actions unless you want the public to have access. The action is tied to your account, so if your account is on your Google Home/Minis, it will be usable right away.
So far I have five phrases I can say to my Google products; each routes through my API, which generates a response with a random number in it, returns it, and has it spoken back to me.
Once this is working, you can tie into/talk to whatever you want from your API. It's completely up to you at that point.

Feel free to add comments or message me with any questions or clarifications.

Thranor
  • Thanks for sharing your process. Would you mind also sharing screenshots of how to set up the scene and custom intent for an action? I wasn't able to follow along with just the descriptions above. – Inventor22 Aug 24 '21 at 23:36
  • @Inventor22 Sure thing! The easiest order may be to create an intent first. Per the intent image: create a new intent (red arrow), then add your test phrases that will help match what a user says (blue arrow). Per the scene image: create a new scene (red arrow) and assign the user intent handling; it's a dropdown that should already list the intent you made ("test" in my case) (blue arrow). Then set the intent matching to either webhook or prompt based on your needs (green arrow). – Thranor Aug 26 '21 at 01:25
  • @Inventor22 The final step is to go to "Main Invocation" in the left menu, then, per that image, click the "when user says" box to edit it and set up your initial voice command; this is basically your setup's version of "hello google" that makes it start listening (red arrow). Then set the "transition to" (blue arrow) to your scene using the dropdown (yellow arrow), so saying the main command phrase will invoke the hello test program and transition to your scene. You can edit what it says to you when it's listening (green arrow); I changed it to "test program is listening." – Thranor Aug 26 '21 at 01:30
  • Then you just need to go to "Test" in the top bar menu, where you can either talk to it or click the prompts. Unless I goofed something up, it should work for you! A note: using your Google Home products for this will still require you to say "hey google" first, then your phrase. So in this case it would be "hey google", wait for Google to start listening, then say "Talk to hello test program", and it should say the reply phrase you put in. Hope it works! Screenshot link: https://imgur.com/a/2vlfexo – Thranor Aug 26 '21 at 01:34
  • @Thranor thank you so much for this detailed walkthrough! I'm struggling to pass information to my webhook -- I have everything working, but I cannot find the data I'm trying to send anywhere in the Google request object. How did you do this? – Kendra Feb 08 '22 at 23:52
  • @Kendra I'm not sure without being able to look at it. If you go through the "Test" tab as described above and send the request over from Google, you should be able to see the errors by clicking on the message that produced them. It should list the webhook request and response; make sure the models in your code match those perfectly. If you are testing locally on your PC with your code running on localhost, it won't work either; a solution for that using ngrok is listed above as well. – Thranor Feb 09 '22 at 20:49
  • @Thranor thank you so much for your response. I'm currently using ngrok and the Test tab per your (very good!) instructions. I'm able to successfully post to my server (HUZZAH!) but I cannot pass information through this POST request. (I was assuming I could pass params or something?) For example, I'd like to say "Hey Google, record a feed. 4 oz" and I'd expect to get "record a feed 4 oz" somewhere in the POST request body? – Kendra Feb 16 '22 at 19:48
  • @Kendra It should be there in the webhook request made by Google. In the "Test" tab of the Actions Console, when you run a test you can see the output on the right-hand side if you expand the message box. It should have a "webhookRequest" that you can expand further to see the exact JSON payload being sent. On mine, I see "requestJson" with "handler" as the first object, which says which handler was triggered. The second object I see is "intent", which shows which intent was triggered as well as a "query" property that shows the input phrase. This gets sent to your API (see the assembled example after these comments). Can you confirm? – Thranor Feb 17 '22 at 21:46
  • @Thranor THANK YOU. Wow, why isn't this documented better?? This is exactly what I was looking for. Follow-up question -- passing params? – Kendra Feb 20 '22 at 02:06
  • @Thranor more detail -- I see "Add intent parameters" in the Actions Console and I see where the params should be showing up in my request below, but how do you verbally add this when you actually speak to Google Home? When saying the parameter name (what I would assume?), I am still getting nothing. Thank you again! `{ handler: { name: 'kendra_feed_sockboi' }, intent: { name: 'test', params: {}, query: 'test test test' }, scene: { name: 'test_scene', slotFillingStatus: 'FINAL', slots: { sockBoyNumber: [Object] }, next: { name: 'test_scene' } },...` – Kendra Feb 20 '22 at 02:14
  • @Thranor also I'm curious if maybe this should be a "slot" instead of an intent parameter. I have two main use cases: one is simple logging ("I've done this!") and another is adding a variable amount (e.g. "Log 4 oz milk"). But again, the documentation is seriously lacking. – Kendra Feb 20 '22 at 02:17
  • @Kendra A quick read on slots suggests that is indeed what you will need. I didn't use them for my use case, so I can't help you get them working, and their documentation is really atrocious around this. With some trial and error I'm sure you'll get it. Keep using that test console and looking at the web request; slots show up under "scene", I believe. Good luck! – Thranor Feb 21 '22 at 12:54
  • @Thranor -- hi again! The good news is half of my project is working (the part where I'm tracking/parsing/routing what I'm saying). The less good news is I still cannot figure out how to take what I'm sending back to Google and get Google to say it (e.g. getting response text into the "firstSimple" part of the JSON). Any help/guidance is much appreciated!! – Kendra Apr 13 '22 at 16:32
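Putting the comment thread together: the webhook request Google sends to your endpoint looks roughly like the following, with field values taken from the snippet above (the slot object contents are elided, as in the original):

```json
{
  "handler": { "name": "kendra_feed_sockboi" },
  "intent": { "name": "test", "params": {}, "query": "test test test" },
  "scene": {
    "name": "test_scene",
    "slotFillingStatus": "FINAL",
    "slots": { "sockBoyNumber": { } },
    "next": { "name": "test_scene" }
  }
}
```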

2 Answers


What you want is not possible end-to-end on the Google Assistant platform. While you are able to use routines, which let you map a given text query to a series of actions, those actions are generally limited to existing Google Assistant commands and functions like device control or open-ended input.

You could create a routine that maps to invoking a conversational action, which would give you the ability to programmatically control the response. However, this platform does the computation in the cloud and sends the response to the target device. This means that a local network URL cannot be accessed by the Assistant.

If you had some way to expose that local URL data via a publicly accessible URL (publicly accessible, though you can add authentication), then it would be feasible.

This restriction does not apply to smart home devices, which do let you access local network URLs through the Local Home SDK.

Nick Felker
  • I actually figured out how to do it. It took me forever, and their documentation is atrocious. I am using a .NET Core 5.0 API as my backend to do all the controls I want. I can post my process here if that would be useful to folks following behind. – Thranor Jun 16 '21 at 23:53
  • Posting a detailed answer could certainly be useful for future readers. – Nick Felker Jun 17 '21 at 17:18

Answered in the edit of the initial question.

Thranor