Speech Recognition using CMU Sphinx, JSAPI and Google Speech API

Question

Speech recognition is one of the many features of my current project, which will most probably be developed in J2EE (other languages are also welcome if the choice is justified).

Most of the links on Google and on SO suggest the three options mentioned above: Sphinx 4, JSAPI used directly, and the Google Speech API (making a server call to Google and then getting the result back as text).

What other freely available options do I have? And if I use Sphinx-4, how do I get a general English language model to use with it?

score 3 · Accepted Answer · edited Dec 29 '11 at 17:58

Yes, there are.

You can use a wrapper around the Google Speech Recognizer, which takes basically one line of code. You send the speech audio in FLAC or Speex format and receive the recognized text and a confidence score. The only problem is that Google could shut the API down, as it did with Google Translate.
Another option is Sphinx (Sphinx4 or PocketSphinx).
You can also use HTK (http://htk.eng.cam.ac.uk/) with its decoder HVite, or another decoder such as Julius (http://julius.sourceforge.jp/en/). There are other tools that use HTK to train acoustic models and/or language models and grammars.

Voxforge has acoustic and language models for HTK and Sphinx (http://voxforge.org/).

edited Dec 29 '11 at 17:58

Bo Persson

answered Dec 29 '11 at 16:52

Luis Uebel

This answer is misleading. HTK is written in C and is not suitable for J2EE. It's also not free to use in applications. Voxforge doesn't provide language models for either HTK or Sphinx. – Nikolay Shmyrev Dec 30 '11 at 11:30
If VoxForge doesn't support Sphinx, then why do they provide models? See this link: http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/ Which of these should I use for building a dictation application with Sphinx4? – Amit Jan 18 '12 at 11:35

score 2 · Answer 2 · answered Jan 04 '12 at 22:57

And if I use Sphinx-4, how do I get the language model for general English to be used with it?

You can download them from the CMUSphinx website and from other places. You can also build them yourself. One possible location is

http://www.keithv.com/software/csr/
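
If the model you download is a plain-text ARPA file (an assumption; some packages ship compressed or binary formats instead), a quick way to compare the available versions is to read the n-gram counts from the file's \data\ header. The sketch below uses only the Java standard library and does not touch Sphinx4 itself; the class name ArpaHeader and the method readCounts are hypothetical names made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: reads the n-gram counts declared in an ARPA model header. */
class ArpaHeader {

    /** Maps each n-gram order (1, 2, 3, ...) to the count listed in the \data\ section. */
    static Map<Integer, Long> readCounts(BufferedReader in) throws IOException {
        Map<Integer, Long> counts = new TreeMap<>();
        boolean inData = false;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.equals("\\data\\")) {
                inData = true;
            } else if (inData && line.startsWith("ngram ")) {
                // Header lines look like "ngram 2=12345".
                String[] parts = line.substring("ngram ".length()).split("=");
                counts.put(Integer.parseInt(parts[0].trim()), Long.parseLong(parts[1].trim()));
            } else if (inData && line.startsWith("\\")) {
                break; // the first n-gram section (e.g. "\1-grams:") ends the header
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to an uncompressed ARPA file (an assumption of this sketch).
        try (BufferedReader in =
                 Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            readCounts(in).forEach((order, count) ->
                    System.out.println(order + "-grams: " + count));
        }
    }
}
```

Run it with the path to the unpacked model file as the only argument. Larger counts generally mean a bigger model that needs more memory to load, so this gives a rough basis for choosing between versions.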

answered Jan 04 '12 at 22:57

Nikolay Shmyrev

Which version should I download from the above link? Can you please explain steps 3, 4 and 5 of readme.txt? How can I use these models to build a dictation application? – Amit Jan 18 '12 at 11:31
