Google Cloud Speech API real time recognition

Question

I am developing a Python application for real-time translation. I need to recognize speech in real time: as the user speaks, the application should automatically send each piece of audio to the Google Speech API and get text back. I want the recognized text to appear immediately while the user is speaking.

I've found Streaming Speech Recognition, but it seems that I still need to record the full speech first and then send it to the server. Also, there are no examples of how to use it in Python.

Is it possible to do this with Google Speech API?

score 1 · Answer 1 · answered Nov 23 '17 at 12:13

You can do it with the Google Speech API.

However, it has a 1-minute content limit.

Please check the link below.

https://cloud.google.com/speech/quotas

So you have to restart the stream every minute.

The link below has example Python code for microphone streaming.

https://cloud.google.com/speech/docs/streaming-recognize#speech-streaming-recognize-python
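
To illustrate the restart, here is a minimal sketch of just the chunking logic, without any call to the Speech API. It assumes audio arrives as fixed-length chunks and uses the 1-minute limit mentioned above as the default; check the quotas page for the current value. The name `iter_sessions` and its parameters are hypothetical and not part of any Google library.

```python
import itertools
from typing import Iterable, Iterator


def iter_sessions(
    chunks: Iterable[bytes],
    chunk_seconds: float,
    limit_seconds: float = 60.0,  # assumption: the 1-minute limit from the answer
) -> Iterator[Iterator[bytes]]:
    """Lazily split an audio chunk stream into sessions under the time limit.

    Each yielded session is itself a lazy iterator, so chunks are passed on
    as soon as they arrive. Consume one session fully before requesting the
    next, because all sessions read from the same underlying source.
    """
    per_session = int(limit_seconds // chunk_seconds)
    if per_session < 1:
        raise ValueError("chunk_seconds must not exceed limit_seconds")

    source = iter(chunks)
    while True:
        # Peek at one chunk so no empty session is started at end of input.
        first = next(source, None)
        if first is None:
            return
        rest = itertools.islice(source, per_session - 1)
        yield itertools.chain([first], rest)


if __name__ == "__main__":
    # Simulated input: 250 chunks of 0.5 s each (125 s of audio).
    fake_audio = (bytes(2) for _ in range(250))
    for number, session in enumerate(iter_sessions(fake_audio, 0.5), start=1):
        # In a real application, each session would feed one streaming request.
        print(f"session {number}: {sum(1 for _ in session)} chunks")
```

In a real application, each inner iterator would be passed to a fresh streaming request (as in the linked sample), so a new request starts before the limit is reached.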

score 0 · Answer 2 · answered Nov 20 '17 at 14:40

Check this link out:

https://github.com/Uberi/speech_recognition/blob/master/examples/microphone_recognition.py

This is an example of obtaining audio from the microphone. There are several components for the recognition process. In my experience, Sphinx recognition lacks accuracy, while Google Speech Recognition works very well.

score 0 · Answer 3 · answered Jul 12 '21 at 15:32

Working with the Google Speech API for real-time transcription is a bit cumbersome. You can use this repository for inspiration: https://github.com/saharmor/realtime-transcription

It transcribes the client-side microphone input in real time (disclaimer: I'm the author).
