Processing a live video from YouTube for speech to text

Question

I'd like to apply the Google Cloud Speech-to-Text API to a YouTube live video in order to transcribe the audio and then apply some functions to highlight parts of the transcribed text.

I've been reading the documentation for both the Google Cloud Speech-to-Text API and the YouTube API, but I found no proper example of how to do this.

All the examples refer to other inputs, such as videos that are not live (where the YouTube stream is first converted to a video file such as AVI) or a microphone connected to a device.

Do you know if there's a way to do this? Do you have an example of how to approach it?

score 1 · Accepted Answer · answered Feb 25 '19 at 21:48

The Google Cloud Speech-to-Text API has a way to do this: the streaming recognition method, "StreamingRecognize". You feed it the audio stream in chunks, and it returns the transcription as the audio arrives. The recognition settings are quite customizable.

https://cloud.google.com/speech-to-text/docs/streaming-recognize#speech-streaming-recognize-python

You just have to find a way to reliably get the audio stream from YouTube.
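
As a rough illustration of feeding a stream to StreamingRecognize, here is a minimal sketch, not a tested end-to-end solution. It assumes that `stream_url` is a direct media URL that ffmpeg can read (getting that URL from a YouTube live page is not covered here), that ffmpeg is installed, and that the `google-cloud-speech` Python package and credentials are set up. The helper names `ffmpeg_command`, `read_chunks` and `transcribe`, the 16 kHz sample rate, the 100 ms chunk size and the `en-US` language code are all choices made for this example.

```python
import subprocess
from typing import BinaryIO, Iterator, List

SAMPLE_RATE = 16000   # assumed sample rate for this sketch
CHUNK_BYTES = 3200    # 100 ms of 16-bit mono PCM at 16 kHz


def ffmpeg_command(stream_url: str) -> List[str]:
    """Build an ffmpeg command that decodes a media URL to raw
    16-bit mono PCM and writes it to stdout."""
    return [
        "ffmpeg", "-loglevel", "quiet",
        "-i", stream_url,
        "-vn",                              # drop the video track
        "-f", "s16le", "-acodec", "pcm_s16le",
        "-ac", "1", "-ar", str(SAMPLE_RATE),
        "pipe:1",
    ]


def read_chunks(pipe: BinaryIO, size: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield fixed-size blocks from a binary pipe until it is exhausted."""
    while True:
        data = pipe.read(size)
        if not data:
            break
        yield data


def transcribe(stream_url: str) -> None:
    """Pipe decoded audio into StreamingRecognize and print final results."""
    # Imported here so the helpers above work without the package installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=SAMPLE_RATE,
        language_code="en-US",              # assumed language
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config, interim_results=True
    )

    proc = subprocess.Popen(ffmpeg_command(stream_url), stdout=subprocess.PIPE)
    try:
        requests = (
            speech.StreamingRecognizeRequest(audio_content=chunk)
            for chunk in read_chunks(proc.stdout)
        )
        responses = client.streaming_recognize(
            config=streaming_config, requests=requests
        )
        for response in responses:
            for result in response.results:
                if result.is_final:
                    # Highlighting logic could be applied to this text.
                    print(result.alternatives[0].transcript)
    finally:
        proc.terminate()
```

Calling `transcribe(stream_url)` would print each finalized piece of transcript as it arrives. A single streaming session is time-limited on the API side, so a long live stream would likely need to reopen the streaming call periodically; that reconnect logic is left out here.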
