What should I use between CMU Pocketsphinx and CMU Sphinx4 to get subtitles from video files?

Question

I would like to extract subtitles from video files eventually.

The current video files are stored on a local disk, so they will serve as train/test data. But imagine that I have a running web app where I upload a fresh video, and the app should extract the subtitles at upload time. I want the result to be as accurate as either of these decoders can make it :) Please advise.

score 3 · Answer 1 · answered Oct 18 '16 at 11:37


You need to use Kaldi

With its implementation of modern speech recognition algorithms (deep neural networks and WFST search), Kaldi is much more accurate (> 50%) and much faster. Neither of those is implemented in sphinx4 or pocketsphinx.
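Whichever decoder you end up with, the web app still has to pull the audio out of the uploaded video and turn the recognized segments into a subtitle file. Below is a minimal sketch of those two steps, not a definitive implementation. It assumes that `ffmpeg` is installed, that the acoustic model expects 16 kHz mono 16-bit PCM (check your model's sample rate), and that the decoder gives you `(start, end, text)` segments in seconds. The function names are hypothetical and the decoder call itself is left out.

```python
from typing import List, Tuple


def ffmpeg_audio_command(video_path: str, wav_path: str) -> List[str]:
    """Build an ffmpeg command that extracts the audio track as WAV.

    Assumption: the recognizer wants 16 kHz, mono, 16-bit PCM input.
    """
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input video
        "-vn",                   # drop the video stream
        "-ac", "1",              # one audio channel (mono)
        "-ar", "16000",          # 16 kHz sample rate
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        wav_path,
    ]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as the body of an .srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

You would pass the command list to `subprocess.run(..., check=True)`, feed the resulting WAV file to the recognizer, and write the output of `to_srt` next to the video.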


Nikolay Shmyrev


Wow, I'm not very familiar with the Sphinx architecture yet, but I gathered that its acoustic models are based on hidden Markov models. Thanks, I will take a look at Kaldi, but then my obvious question is: what is CMU Sphinx's competition? I guess I should ask a separate question on "Sphinx vs Kaldi". Thanks again – Novitoll Oct 18 '16 at 11:47
You can ask that question, but not on Stack Overflow. Questions asking to recommend or find a tool are not welcome here. – Nikolay Shmyrev Oct 18 '16 at 12:02
