How to receive answer from Google Assistant as a String, not as an audio stream

Question

I am using the Python libraries from the Assistant SDK for speech recognition via gRPC. I get the recognized speech as a string from resp.result.spoken_request_text in \googlesamples\assistant\__main__.py, and I get the answer from the Assistant API as an audio stream from resp.audio_out.audio_data, also in \googlesamples\assistant\__main__.py.

I would like to know whether the service can also return the answer as a string (I am hoping it is available in the service definition, or that it could be added), and how I could request or access that string.

Thanks in advance.

score 5 · Answer 1 · answered May 23 '17 at 12:29


Currently (Assistant SDK Developer Preview 1), there is no direct way to do this. You can probably feed the audio stream into a Speech-to-Text system, but that really starts getting silly.
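If you do go the Speech-to-Text route, the first step is collecting the audio chunks into something a recognizer can read. Here is a minimal sketch of that step only, using the standard library. The helper name audio_chunks_to_wav is made up for illustration, and the 16 kHz, 16-bit mono format is an assumption: use whatever audio_out_config you actually requested. The responses here are stand-in objects, not real gRPC messages.

```python
import io
import wave
from types import SimpleNamespace


def audio_chunks_to_wav(responses, sample_rate=16000):
    """Join resp.audio_out.audio_data chunks into an in-memory WAV file.

    Assumes 16-bit mono linear PCM (an assumption, match your own config).
    """
    pcm = b"".join(
        resp.audio_out.audio_data
        for resp in responses
        if resp.audio_out.audio_data  # skip responses that carry no audio
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)       # mono (assumed)
        wav.setsampwidth(2)       # 16-bit samples (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Stand-ins for the streamed responses; a real stream comes from the gRPC call.
fake_responses = [
    SimpleNamespace(audio_out=SimpleNamespace(audio_data=b"\x00\x00\x01\x00")),
    SimpleNamespace(audio_out=SimpleNamespace(audio_data=b"")),
    SimpleNamespace(audio_out=SimpleNamespace(audio_data=b"\x02\x00\x03\x00")),
]
wav_bytes = audio_chunks_to_wav(fake_responses)
```

The resulting bytes could then be handed to whichever speech recognizer you choose; that part is not shown here.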

Speaking to the engineers on this subject while at Google I/O, they indicated that there are some technical complications on their end to doing this, but they understand the use cases. They need to see questions like this to know that people want the feature.

Hopefully it will make it into an upcoming Developer Preview.

answered May 23 '17 at 12:29

Prisoner


Thank you for your reply. I thought about feeding the returned audio stream back to the Assistant to get the response transcribed as text, but as you say, it starts to get silly. It is just inefficient. Have you used gRPC with Python to communicate with the Speech API of Google Cloud Platform to process audio streams? – Towerss May 23 '17 at 23:04
That is best asked as a separate question. – Prisoner May 23 '17 at 23:08

score 0 · Accepted Answer · answered Aug 08 '18 at 22:51

Update: for google.assistant.embedded.v1alpha2, the Assistant SDK includes the field supplemental_display_text, which is meant to provide the Assistant response as text that aids the user's understanding or can be displayed on screens. This makes the text available to the developer. See the Google Assistant documentation.
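As a rough sketch of how that could look in Python: in v1alpha2 the field sits on the dialog_state_out message of each streamed response, so you can pick it up while iterating over the stream. The helper name collect_display_text is made up for illustration, and the responses below are stand-in objects rather than real AssistResponse messages, so treat this as an outline and not the SDK sample's own code. Note that the field may be empty for some responses.

```python
from types import SimpleNamespace


def collect_display_text(responses):
    """Return the non-empty supplemental_display_text values, joined by spaces."""
    parts = []
    for resp in responses:
        dialog_state = getattr(resp, "dialog_state_out", None)
        if dialog_state is None:
            continue
        text = getattr(dialog_state, "supplemental_display_text", "")
        if text:  # the Assistant does not always send display text
            parts.append(text)
    return " ".join(parts)


# Stand-ins for the streamed responses; a real stream comes from the gRPC call.
fake_responses = [
    SimpleNamespace(dialog_state_out=SimpleNamespace(supplemental_display_text="")),
    SimpleNamespace(
        dialog_state_out=SimpleNamespace(
            supplemental_display_text="It is 21 degrees in Paris."
        )
    ),
]
answer_text = collect_display_text(fake_responses)
print(answer_text)
```

With a real stream you would pass the iterator returned by the streaming call in place of fake_responses.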
