One issue I am seeing is that when I ask the user in Dialogflow to spell out their user ID, such as joesmith2014, there are a large number of recognition errors. The following post suggested that I can fix this by using speech contexts to tell the speech-to-text engine that the user will be spelling out alphanumerics.

https://stackoverflow.com/questions/62048288/dialogflow-regex-alphanumeric-speech

I can't figure out how to do this with the actions-on-google library. Or can this not be done in the fulfillment webhook?

Thanks.

Gavin Siu

2 Answers

As an example, I created an agent called “alphanumeric”, which accepts any alphanumeric value I send it. I set it up with the following steps:

  1. Check the box regexp entity
  2. Add a single entry, ^[a-zA-Z0-9]{3}[a-zA-Z0-9]*$
  3. Then save it

Your agent should look something like this:

Agent created

Please note that the regexp entity I added is strict in that it is looking only for a string of alphanumerics, without any spaces or dashes. This is important for two reasons:

  1. This regexp follows the auto speech adaptation requirements for enabling the "spelled-out sequence" recognizer mode.

  2. By not looking for spaces and only looking for entire phrases (^...$), you allow end-users to easily exit the sequence recognition. For example, when you prompt "what's your order number" and an end-user replies "no I want to place an order", the regexp will reject and Dialogflow will know to look for another intent that might match that phrase.

If you are only interested in numeric values, you can create a more tailored entity like [0-9]{3}[0-9]*, or even just use the built-in @sys.number-sequence entity.
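To see how strict the regexp above is, here is a minimal sketch that tries it on a few inputs. It uses Python's `re` module as a stand-in for Dialogflow's own matcher, which is an assumption, and the helper name `is_alphanumeric_sequence` is made up for illustration.

```python
import re

# Pattern copied from the regexp entity above: at least three
# alphanumeric characters, no spaces or dashes, whole input only.
ALPHANUMERIC = re.compile(r"^[a-zA-Z0-9]{3}[a-zA-Z0-9]*$")


def is_alphanumeric_sequence(text: str) -> bool:
    """Return True when the whole input is one unbroken alphanumeric string."""
    return ALPHANUMERIC.match(text) is not None


# A user ID passes; short input, spaced input, and ordinary phrases do not.
for sample in ["joesmith2014", "ab", "abc 123", "no I want to place an order"]:
    print(f"{sample!r}: {is_alphanumeric_sequence(sample)}")
```

Because the last sample is rejected, Dialogflow is free to match that phrase against another intent, which is the exit behaviour described in point 2.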

Eduardo Ortiz
  • I tried the entity, but it does not solve the problem. Suppose I tell Dialogflow the input will be @sys.number-sequence. The speech-to-text engine should convert "one three three" to 133, but instead I get "1 free free". The post suggested using speech contexts, but I can't figure out how to do this within the actions-on-google library. Perhaps it needs to be done somewhere earlier? – Gavin Siu Mar 11 '22 at 07:17
  • Note that the regexp entity has the same issue. Even with a regexp entity, I still get "free" instead of "three", for example. – Gavin Siu Mar 11 '22 at 13:22
  • It might be because the audio quality is lower than what Google recommends. This [documentation](https://cloud.google.com/speech-to-text/docs/best-practices) could help you understand what might be causing the **one free free** error – Eduardo Ortiz Mar 15 '22 at 22:52
  • I think the big problem is that I am looking at this from the Dialogflow side. In this architecture, some sort of IVR makes a request to Dialogflow, and the audio goes through Google Speech-to-Text. The speech class token has to come with the request, since by the time the input hits Dialogflow, the speech-to-text step has already been processed. I am using the audio code IVR, and it appears to have some provision for setting the token on subsequent requests, so I may have to explore that. – Gavin Siu Mar 16 '22 at 14:21
  • As I can't replicate your issue with speech-to-text and Dialogflow, my best suggestion is that you submit your question to [support](https://cloud.google.com/support-hub) – Eduardo Ortiz Mar 18 '22 at 17:48

Dialogflow ES fulfillment cannot affect speech recognition quality because speech-to-text processing happens before the Dialogflow request is sent to the fulfillment webhook. Check the diagram in the Dialogflow ES Basics documentation.
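A minimal sketch of why this is so: by the time a webhook runs, the request it receives already carries the transcribed text in `queryResult.queryText`, so there is no audio left to influence. The handler name `handle_webhook` and the sample payload are made up for illustration, and no web framework is assumed.

```python
def handle_webhook(request_json: dict) -> dict:
    """Build a webhook response from an already-transcribed request."""
    query_result = request_json.get("queryResult", {})
    # queryText holds the finished transcript; the webhook never sees audio.
    transcript = query_result.get("queryText", "")
    return {"fulfillmentText": f"I heard: {transcript}"}


# Hypothetical payload: the misrecognition has already happened upstream.
sample_request = {"queryResult": {"queryText": "1 free free"}}
print(handle_webhook(sample_request))
```

Whatever the webhook does with `"1 free free"`, it can only post-process the text; the recognition itself has to be fixed in the agent settings or in the API request that carries the audio.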

You can improve speech recognition quality either by enabling auto speech adaptation in the agent settings or by sending speech contexts in the Dialogflow API requests. Note that speech contexts sent via API override implicit speech context hints generated by auto speech adaptation.
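As a rough sketch of the second option, the snippet below builds the JSON body for a Dialogflow ES `detectIntent` REST call with a speech context attached to the audio config. The encoding, the 8000 Hz sample rate, the `en-US` language code, the boost value, and the use of the Cloud Speech class token `$OOV_CLASS_ALPHANUMERIC_SEQUENCE` as a phrase are all assumptions for illustration; check which tokens and values your integration accepts. The helper name `build_detect_intent_body` is hypothetical, and the snippet does not send the request.

```python
import base64
import json


def build_detect_intent_body(audio: bytes, phrases: list[str], boost: float = 20.0) -> dict:
    """Return a detectIntent request body with speech contexts attached."""
    return {
        "queryInput": {
            "audioConfig": {
                # Assumed telephony-style audio settings; adjust to your source.
                "audioEncoding": "AUDIO_ENCODING_LINEAR_16",
                "sampleRateHertz": 8000,
                "languageCode": "en-US",
                # Hints for the recognizer; these override the implicit
                # hints generated by auto speech adaptation.
                "speechContexts": [{"phrases": phrases, "boost": boost}],
            }
        },
        # The REST API expects the audio as a base64-encoded string.
        "inputAudio": base64.b64encode(audio).decode("ascii"),
    }


# Placeholder bytes stand in for real recorded audio.
body = build_detect_intent_body(b"\x00\x01", ["$OOV_CLASS_ALPHANUMERIC_SEQUENCE"])
print(json.dumps(body, indent=2))
```

The point of the sketch is where the hint lives: it is part of the request made by whatever client sends the audio (for example an IVR), not something a fulfillment webhook can set.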

If you use regexp entities, make sure that your agent design meets all the requirements listed in this speech adaptation with regexp entities document. Here is an example of what an intent that collects the employee ID and satisfies these requirements may look like: [image]

When testing the agent, make sure that you test it consistently via voice, including the inputs preceding the utterance expected to match a regexp entity.

This tutorial for iterative confirmation of spoken sequences may also help with the agent design.

Svetlana