3

I am trying to build a speech recognition app based on CMU Sphinx. I have created my own language model using the lmtool, but in order to improve recognition accuracy I want to tune Sphinx. Are there any guidelines for choosing properties like absoluteBeamWidth, relativeBeamWidth, absoluteWordBeamWidth and languageWeight? I am not exactly sure what these properties mean. Any links to resources (other than the incomplete tuning page on the Sphinx website) that can help me tune Sphinx would also be appreciated.

Thank you

barryhunter
Shishya

1 Answer

4

But in order to improve the accuracy of recognition, I want to tune Sphinx.

Accuracy is not improved through tuning but by using better models and more advanced algorithms. See the FAQ for details:

http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor

Nikolay Shmyrev
  • OK, then why do we use properties like absoluteBeamWidth, relativeBeamWidth, absoluteWordBeamWidth and languageWeight? – Shishya Nov 17 '12 at 05:18
  • 4
    Speech recognition is essentially a search for the best result. Beams restrict the search by dropping variants that score worse than the best one. The relative beam width drops paths whose score is more than the beam factor below the best score. The absolute beam sets the maximum number of paths explored in every frame. The word beams apply only to paths that reach a word ending in the current frame, while the plain beams apply to all paths. A smaller beam speeds up the search, a wider beam makes it slower. Language weight controls the effect of the language model. It is usually chosen experimentally. – Nikolay Shmyrev Nov 17 '12 at 06:53
  • The default values are usually just right; adjusting them does not give any significant improvement. – Nikolay Shmyrev Nov 17 '12 at 06:54
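
To make the comments above concrete, here is a toy sketch of what the two beams and the language weight do to a set of candidate paths in one frame. It is not Sphinx code: the function names `prune` and `combined_score`, the path labels and all the numbers are made up for illustration, and it assumes scores are natural-log probabilities and that the relative beam is given as a probability ratio.

```python
import math


def prune(scores, absolute_beam, relative_beam):
    """Return the paths that survive both beams in one frame.

    scores: dict mapping a (hypothetical) path label to its log-probability.
    absolute_beam: maximum number of paths to keep.
    relative_beam: probability ratio, e.g. 1e-2; a path is dropped when its
        probability is less than relative_beam times the best probability.
    """
    if not scores:
        return {}
    best = max(scores.values())
    # In the log domain, "ratio times the best" becomes an additive offset.
    threshold = best + math.log(relative_beam)
    survivors = {p: s for p, s in scores.items() if s >= threshold}
    # The absolute beam keeps only the top-N of what is left.
    ranked = sorted(survivors.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:absolute_beam])


def combined_score(acoustic_logp, lm_logp, language_weight):
    """Combine acoustic and language model log-scores.

    A larger language_weight makes the language model count for more.
    """
    return acoustic_logp + language_weight * lm_logp


if __name__ == "__main__":
    frame = {
        "a": math.log(0.5),
        "b": math.log(0.4),
        "c": math.log(0.001),
        "d": math.log(0.3),
    }
    # Wide absolute beam: only the relative beam removes anything.
    print(sorted(prune(frame, absolute_beam=10, relative_beam=1e-2)))
    # Narrow absolute beam: fewer paths to explore, so a faster search.
    print(sorted(prune(frame, absolute_beam=2, relative_beam=1e-2)))
```

With a low language weight the hypothesis that sounds better wins, and with a high one the hypothesis the language model prefers wins, which is why that value is normally picked by running recognition on test data rather than by reasoning about it.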