
This is a bit hard to explain, so I am hoping that an example will be the most efficient way to do it.

We are building a service that allows a Parent to maintain a list of their Children and perform actions against these Children.

The Parent is the user making utterances to the Dialogflow (DF) agent/intent.

The Parent adds their children's names to the database through a web UI (not DF) prior to using DF.

This means we might have a database representation such as the following:

PARENT TABLE

ID  Name
1   User A
2   User B     

CHILD TABLE

ID  NAME  PARENT_ID
1   John  1
2   Jon   2
3   Jake  2

The intent phrases have the following format and parameters:

"Do {ACTION} for {CHILD_NAME}"

The problem we are running into is how to make sure that, when the agent extracts the parameters, it passes the correct child name to the fulfillment, so that we can validate that name against the user's children and provide context to the fulfillment.

For example, if User A makes the following utterance:

"Do {ACTION} for John"

How do we ensure that the agent extracts "John" and not "Jon" when it passes the parameters to the fulfillment?

I have seen several suggestions around session entities, and even read through the City Streets Trivia example. However, session entities seem to rely on the idea that the session-specific values (child names in my example, street names in the linked example) are globally maintained and not specific to any user.

I am not sure how this would work in my case. I can't be expected to maintain a list of ALL possible names, and even if I could, I would have multiple entries for John and Jon, and I assume the agent would still have no way of knowing which one to use.

Maybe there is a way to dynamically add entity placeholders for each user, with the possible parameter values for that entity populated from the values we store in the database, but this seems unmaintainable and unrealistic?

What is the solution for this type of problem with conversational design in DF? It seems very common (for example, marking items off in to-do list or shopping list apps).
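To make the idea above concrete, here is a rough sketch of what building such a per-user entity might look like. It assumes Dialogflow ES session entities, an entity type with the hypothetical display name `child_name`, and in-memory rows mirroring the CHILD TABLE above; the project and session IDs are placeholders. It only builds the payload and does not call any Dialogflow API.

```javascript
// Hypothetical rows, mirroring the CHILD TABLE above.
const CHILDREN = [
  { id: 1, name: 'John', parentId: 1 },
  { id: 2, name: 'Jon', parentId: 2 },
  { id: 3, name: 'Jake', parentId: 2 },
];

// Build a session entity type that contains only one parent's children.
// `child_name` is an assumed entity type display name, not one from the agent.
function buildChildSessionEntity(sessionPath, parentId, children) {
  const entities = children
    .filter((child) => child.parentId === parentId)
    .map((child) => ({ value: child.name, synonyms: [child.name] }));

  return {
    name: `${sessionPath}/entityTypes/child_name`,
    // Replace, rather than supplement, any values defined on the agent.
    entityOverrideMode: 'ENTITY_OVERRIDE_MODE_OVERRIDE',
    entities,
  };
}

// Placeholder session path for User A (parent ID 1).
const session = 'projects/my-project/agent/sessions/abc123';
console.log(JSON.stringify(buildChildSessionEntity(session, 1, CHILDREN), null, 2));
```

For User A this payload lists only "John", so "Jon" would not be a candidate value in that session. The payload would have to be registered before the utterance in question is matched, for example from an earlier turn in the same session.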

TheJediCowboy

1 Answer

That was a long question, so I'll mostly respond to this part:

How do we ensure that the agent extracts "John" and not "Jon" when it passes the parameters to the fulfillment?

This might be too simple, but after you extract the name, you could confirm the spelling with the user in a follow-up question.

User: "Do {ACTION} for John"

Agent: "Ok, I heard, 'Do {ACTION} for John.' Is that correct?"

User: "Yes"

// call updateDatabase()
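A minimal sketch of that confirmation step, assuming the extracted parameters are held in memory until the user answers. `createConfirmationFlow` and its reply strings are hypothetical names for illustration; in a real webhook the pending request would more likely live in a context or in conversation storage.

```javascript
// Hold the extracted parameters until the user confirms them.
// `updateDatabase` is passed in so the sketch stays self-contained.
function createConfirmationFlow(updateDatabase) {
  let pending = null;

  return {
    // Called when the intent is matched: store the request and ask back.
    propose(action, childName) {
      pending = { action, childName };
      return `Ok, I heard, 'Do ${action} for ${childName}.' Is that correct?`;
    },

    // Called on the follow-up "yes" / "no" answer.
    answer(confirmed) {
      if (!pending) return 'Sorry, there is nothing to confirm.';
      const request = pending;
      pending = null;
      if (!confirmed) return 'Ok, which child did you mean?';
      updateDatabase(request);
      return 'Done.';
    },
  };
}

// Usage with a stand-in for the real database call.
const saved = [];
const flow = createConfirmationFlow((request) => saved.push(request));
console.log(flow.propose('homework', 'John'));
console.log(flow.answer(true));
```

The database is only touched after an explicit "yes", so a misheard name costs one extra turn instead of a wrong update.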

You'll still have an issue for Google Assistant users using devices without screens, but you could write additional logic to handle response branching for surface capabilities, like...

// Check which capabilities the user's current device (surface) supports.
const hasScreen =
  conv.surface.capabilities.has('actions.capability.SCREEN_OUTPUT');
const hasAudio =
  conv.surface.capabilities.has('actions.capability.AUDIO_OUTPUT');
const hasMediaPlayback =
  conv.surface.capabilities.has('actions.capability.MEDIA_RESPONSE_AUDIO');
const hasWebBrowser =
  conv.surface.capabilities.has('actions.capability.WEB_BROWSER');

If the surface doesn't have a screen, you can split the name string into an array, and read each character back to the user, like...

Agent: "Ok, I heard, 'Do {ACTION} for John, spelled 'J', 'O', 'H', 'N'.' Is that correct?"
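A small sketch of how that prompt could be built, assuming a boolean like `hasScreen` from the capability check is already available. `spellOut` and `buildConfirmation` are hypothetical helper names, not part of any library.

```javascript
// Turn "John" into "'J', 'O', 'H', 'N'" so it can be read back letter by letter.
function spellOut(name) {
  return Array.from(name.toUpperCase())
    .map((letter) => `'${letter}'`)
    .join(', ');
}

// Build the confirmation prompt; spell the name only on screenless devices.
function buildConfirmation(action, childName, hasScreen) {
  const heard = `Do ${action} for ${childName}`;
  if (hasScreen) {
    return `Ok, I heard, '${heard}.' Is that correct?`;
  }
  return `Ok, I heard, '${heard}, spelled ${spellOut(childName)}.' Is that correct?`;
}

console.log(buildConfirmation('homework', 'John', true));
console.log(buildConfirmation('homework', 'John', false));
```

On a device with a screen the user can read the name, so the shorter prompt is enough; on a speaker the spelled-out version is what distinguishes "John" from "Jon".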
Max Wiederholt