I was reading this guide on speech recognition, and it says I need three things: an acoustic model, a language model, and a phonetic dictionary.
I wanted to start playing with this Python demo, which uses GStreamer to capture audio from the mic and resample it to 8 kHz, 16-bit PCM.
I see that I can specify the language model and phonetic dictionary, and I am using the ones [provided by CMU]:
http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20HUB4%20Language%20Model/
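For context, here is a minimal sketch of how I understand the language model and dictionary get passed to the recognizer element. The element names (`autoaudiosrc`, `vader`, `pocketsphinx`) and the `lm` and `dict` properties are my assumptions about what the demo does, and the file names are hypothetical placeholders. It only builds the pipeline description string and does not run GStreamer:

```python
def build_pipeline_description(lm_path, dict_path):
    """Build a gst-launch style description with lm and dict set on the recognizer.

    Element and property names are assumptions based on the demo, not verified
    against any particular plugin version.
    """
    elements = [
        "autoaudiosrc",                        # capture from the default mic
        "audioconvert",                        # convert the sample format
        "audioresample",                       # resample to the rate the recognizer expects
        "vader name=vad auto-threshold=true",  # voice activity detection (assumed element)
        'pocketsphinx name=asr lm="{}" dict="{}"'.format(lm_path, dict_path),
        "fakesink",                            # discard the audio after recognition
    ]
    return " ! ".join(elements)


# Hypothetical file names; substitute the files from the HUB4 download.
description = build_pipeline_description("hub4.lm.DMP", "hub4.dic")
print(description)
```

Nothing in this sketch names an acoustic model, which is what prompts the question.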
But I am confused about where I should specify the acoustic model. Does GStreamer have its own acoustic model that I'm implicitly using? I was hoping to use the acoustic model provided here for slightly better results:
http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20HUB4%20Acoustic%20Model/
(Sorry about the hyperlinks. I can't post more than 2 links with rep less than 10)