Speech Recognition Using OpenVINO

Question

I want to implement a Python project that takes an .mp4 file as input and outputs the transcript or subtitles of the video. The constraint is that it must use OpenVINO. How can I do that?

This may be useful: https://docs.openvinotoolkit.org/latest/_inference_engine_samples_speech_sample_README.html — crypdick, Jan 30 '20 at 17:56

score 1 · Accepted Answer · answered Feb 26 '20 at 14:06

MP4 is a container format, not an audio format. I believe the current OpenVINO speech demos and samples take WAV files as input, since that is what the models are trained on.

If you can extract the audio track from the MP4 container (or convert an MP3) to WAV format with a conversion tool, that may work.
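As a rough sketch of that conversion step, the snippet below shells out to the ffmpeg command-line tool from Python. It assumes ffmpeg is installed and on your PATH. The helper names (`build_ffmpeg_command`, `extract_wav`) are made up for illustration, and the 16 kHz mono 16-bit PCM settings are an assumption; check what the particular OpenVINO speech sample or model actually expects before relying on them.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, wav_path, sample_rate=16000):
    """Return the ffmpeg argument list that extracts mono 16-bit PCM audio.

    The 16 kHz default is an assumption, not a documented OpenVINO requirement.
    """
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(video_path),     # input container (e.g. an .mp4 file)
        "-vn",                     # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM (plain WAV)
        "-ar", str(sample_rate),   # resample to the assumed sample rate
        "-ac", "1",                # downmix to a single (mono) channel
        str(wav_path),
    ]


def extract_wav(video_path, wav_path=None, sample_rate=16000):
    """Run ffmpeg to write a WAV file next to the video and return its path."""
    video_path = Path(video_path)
    wav_path = Path(wav_path) if wav_path else video_path.with_suffix(".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(
        build_ffmpeg_command(video_path, wav_path, sample_rate), check=True
    )
    return wav_path
```

Calling `extract_wav("talk.mp4")` (a hypothetical file name) would write `talk.wav` alongside the video, which you could then pass to the speech sample in place of the .mp4 file.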

speech rec demo
