0

Google Assistant SDK: My user input is always the same command, so instead of asking the user to record a voice command via a device microphone every time, I want the user to press a button and execute the command by passing a pre-recorded audio file as input. Is that possible with the Google Assistant SDK? Preferably in Python, as I want to build an API endpoint around it.

Any links, blogs, tutorials, samples, etc. would be very helpful.

MrLeeh
joke4me

2 Answers

2

The Google Assistant SDK accepts either text or audio data as input.

The pushtotalk sample demonstrates the audio file case.

Here are a few code snippets showing how the sample does it:

audio_source = audio_helpers.WaveSource(
    open(input_audio_file, 'rb'),
    sample_rate=audio_sample_rate,
    sample_width=audio_sample_width
)
# ...
# Create conversation stream with the 
# given audio source and sink.
conversation_stream = audio_helpers.ConversationStream(
    source=audio_source,
    sink=audio_sink,
    iter_size=audio_iter_size,
    sample_width=audio_sample_width,
)
# ...
with SampleAssistant(lang, device_model_id, device_id,
                     conversation_stream,
                     grpc_channel, grpc_deadline,
                     device_handler) as assistant:
    # If file arguments are supplied:
    # exit after the first turn of the conversation.
    if input_audio_file or output_audio_file:
        assistant.assist()
        return
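If you want to trigger this from your own Python code (for example behind an API endpoint, as the question asks), one option is to shell out to the sample rather than import it. The following is only a sketch: the helper names `build_pushtotalk_command` and `send_command` are hypothetical, and it assumes the sample is installed in the same Python environment and that credentials are already set up. The flags are the ones used with the pushtotalk sample elsewhere on this page.

```python
import subprocess
import sys


def build_pushtotalk_command(device_id, device_model_id, input_wav):
    """Build the argument list for one run of the pushtotalk sample.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return [
        sys.executable, "-m", "googlesamples.assistant.grpc.pushtotalk",
        "--device-id", device_id,
        "--device-model-id", device_model_id,
        "-i", input_wav,
    ]


def send_command(device_id, device_model_id, input_wav, timeout=60):
    """Run the sample once with a pre-recorded file and return its exit code.

    Assumes the googlesamples package is importable by this interpreter.
    """
    cmd = build_pushtotalk_command(device_id, device_model_id, input_wav)
    return subprocess.run(cmd, timeout=timeout).returncode
```

An endpoint handler could then call `send_command(...)` on a button press and report success when the exit code is 0.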
Nick Felker
  • Thanks Nick, have tried pushtotalk and textinput, it's awesome! But haven't figured out yet why audio input (pushtotalk -i file.wav) doesn't work, no error or response from assistant – joke4me May 04 '18 at 13:32
  • I'm not really sure why that wouldn't work either. Can you try one of [the audio files](https://github.com/googlesamples/assistant-sdk-python/tree/master/google-assistant-sdk/tests/data) included in the sample? They're in the `riff` format fwiw. – Nick Felker May 04 '18 at 16:33
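One thing that may be worth ruling out when a file produces no response is a format mismatch. The sketch below uses Python's standard `wave` module to report how a file differs from what the sample probably expects. The expected values are assumptions, not confirmed in this thread: 16000 Hz and 16-bit samples are, as far as I know, the sample's defaults, and mono is a guess. `check_wav` is a hypothetical helper name.

```python
import wave

# Assumed expectations (not confirmed in this thread):
EXPECTED_RATE = 16000      # Hz
EXPECTED_WIDTH = 2         # bytes per sample, i.e. 16-bit
EXPECTED_CHANNELS = 1      # mono


def check_wav(path):
    """Return a list of mismatches; an empty list means the file looks fine."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append("sample rate is %d Hz" % wav.getframerate())
        if wav.getsampwidth() != EXPECTED_WIDTH:
            problems.append("sample width is %d bytes" % wav.getsampwidth())
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append("channel count is %d" % wav.getnchannels())
        if wav.getnframes() == 0:
            problems.append("file contains no audio frames")
    return problems
```

If `check_wav("file.wav")` reports a different rate or width, converting the file or passing matching values to the sample would be the next thing to try.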
0

py -m googlesamples.assistant.grpc.pushtotalk --device-id "your device id" --device-model-id "your device model" -i "your_sound_file.wav"

This does not wait for you to play a file; it sends the file, executes the command, and exits. I used a website (I can't remember which) that generates WAV files of whatever you want, and made one that says "turn on the porch light". Now I have a script that loops forever and checks whether it can ping my cell phone on my home network. If it detects the phone arriving and the time is between 7pm and 7am, it turns on the porch light for me automatically.

The only problem I have is that I don't need to hear the Assistant's voice. I wish there were an argument that muted the Assistant's responses for the current "conversation". As a workaround I use nircmd, copied to the system32 folder on Windows: I set the volume to 0, run the Assistant command, then set the volume back to 20%. It works pretty well.
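The decision logic of a loop like that can be sketched roughly as follows. This is not my actual script, just an illustration: the function names are hypothetical, the 7pm to 7am window is hard-coded as defaults, and the ping check assumes a `ping` executable is on the PATH.

```python
import datetime
import platform
import subprocess


def in_night_window(now, start_hour=19, end_hour=7):
    """True if `now` (a time or datetime) falls between 7pm and 7am.

    The window wraps past midnight, so it is an OR of two ranges.
    """
    return now.hour >= start_hour or now.hour < end_hour


def phone_is_reachable(host):
    """Send a single ping to `host` and report whether ping exited with 0.

    Caveat: on Windows, ping can exit with 0 even for some unreachable
    replies, so treat this as a rough presence check only.
    """
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def should_trigger(was_home, is_home, now):
    """Fire only on the away -> home transition inside the night window."""
    return is_home and not was_home and in_night_window(now)
```

The loop itself would poll `phone_is_reachable(...)` every so often, remember the previous result, and run the pushtotalk command whenever `should_trigger(...)` returns True, so the light is not switched on again on every iteration while the phone stays home.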

james28909