(Google Speech API) What is frame size?

Question

The Google Speech-to-Text documentation recommends using a 100 ms frame size as a tradeoff between latency and efficiency:

Any frame size is acceptable. Larger frames are more efficient, but add latency. A 100-millisecond frame size is recommended as a good tradeoff between latency and efficiency.
-Best Practices

However, I do not know what "frame size" means here. Is the frame size the same as AudioBuffer.length?


dhauptman · Answer 1 · 2018-12-10T16:41:55.920


The frames are the chunks of audio sent in StreamingRecognizeRequest messages. Each message contains exactly one of two fields: streaming_config or audio_content. The first StreamingRecognizeRequest message carries only streaming_config; all subsequent messages carry audio_content, so the frame size is the amount of audio in each audio_content message.

You can find more details in this and this documentation page.
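
To make the message order and the 100 ms figure concrete, here is a minimal sketch in plain Python. It does not call the real client library: the requests are ordinary dicts that only mirror the streaming_config and audio_content field names from the answer. The sample rate (16 kHz), the encoding (16-bit mono PCM) and the helper names frame_size_bytes and build_requests are assumptions made for illustration. With those assumptions, a 100 ms frame is 1600 samples, or 3200 bytes. For comparison, AudioBuffer.length in the Web Audio API counts sample-frames, so it equals the frame size only if the buffer happens to hold 100 ms of audio.

```python
from typing import Dict, Iterator, Union

SAMPLE_RATE_HZ = 16000   # assumed sample rate, not taken from the question
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM
FRAME_MS = 100           # frame duration recommended in Best Practices


def frame_size_bytes(sample_rate_hz: int = SAMPLE_RATE_HZ,
                     bytes_per_sample: int = BYTES_PER_SAMPLE,
                     frame_ms: int = FRAME_MS) -> int:
    """Return the number of bytes of mono PCM audio in one frame."""
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    return samples_per_frame * bytes_per_sample


def build_requests(config: dict, audio: bytes,
                   frame_bytes: int) -> Iterator[Dict[str, Union[dict, bytes]]]:
    """Yield request-shaped dicts: the config first, then one frame per message.

    These dicts are hypothetical stand-ins for StreamingRecognizeRequest,
    not objects from the actual API.
    """
    yield {"streaming_config": config}
    for start in range(0, len(audio), frame_bytes):
        yield {"audio_content": audio[start:start + frame_bytes]}


if __name__ == "__main__":
    size = frame_size_bytes()
    one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    # The config contents here are a placeholder.
    requests = list(build_requests({"interim_results": True},
                                   one_second_of_silence, size))
    print(size, len(requests))  # 3200 11
```

One second of audio yields one config message followed by ten audio messages. Larger frames would mean fewer messages but a longer wait before each one can be sent, which is the tradeoff the documentation describes.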

edited Dec 10 '18 at 16:41

answered Dec 10 '18 at 16:34


I understand. Thank you for your perfect explanation! – 金城亜優 Dec 11 '18 at 02:16
I'm glad to know it helped. Please accept the answer if you consider it valid, for the benefit of the community :) Thanks! – dhauptman Dec 11 '18 at 10:52
