I'm creating a small application that needs a live feed of phonemes as the user speaks into their microphone. In my case, the speed of the recognition output is the number one priority, even over accuracy. C# is my preference, but I would switch if a different language and/or library (like CMUSphinx) can deliver better speed.
Using System.Speech.Recognition, along with DictationGrammar("grammar:dictation#pronunciation"), I've been able to create a simple and effective phoneme recognizer that outputs phonemes as you speak into the mic, with generally impressive accuracy (subscribing to the SpeechRecognitionEngine.SpeechHypothesized event allows me to see live output). The problem is that it has a minimum delay of around 0.5 s between the user speaking and the output, which is too much for this project. I know that in general this is fairly fast, especially considering the good accuracy, but I really need something faster, even if the accuracy takes a big hit. Is there any way to configure a SpeechRecognitionEngine to throw accuracy out the window in order to produce hypotheses faster? I found some exposed settings via SpeechRecognitionEngine.UpdateRecognizerSetting, but they seem to have little effect on the output for phoneme recognition.
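For reference, the setup described above could be sketched roughly as follows. This is a minimal illustration rather than the exact project code: it assumes a console application on Windows with a reference to System.Speech, and the "ResponseSpeed" setting name and its value are only an example of the kind of setting that can be passed to UpdateRecognizerSetting, not necessarily the ones that were tried.

```csharp
using System;
using System.Speech.Recognition;

class PhonemeFeed
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // The pronunciation dictation grammar makes the engine report phonemes instead of words.
            engine.LoadGrammar(new DictationGrammar("grammar:dictation#pronunciation"));

            // Hypotheses are raised while the user is still speaking; this is the live output.
            engine.SpeechHypothesized += (sender, e) => Console.WriteLine(e.Result.Text);

            // Assumption: an example setting and value, chosen only for illustration.
            engine.UpdateRecognizerSetting("ResponseSpeed", 0);

            // Listen on the default microphone until Enter is pressed.
            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine();
        }
    }
}
```

Timestamping each SpeechHypothesized callback against the moment of speaking is one way to measure the roughly 0.5 s delay in question.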
I've also looked into CMUSphinx, a free speech recognition project that looked promising. Sphinx4 was easy to compile and set up for a test in Java, but I couldn't figure out how to configure it to output phonemes live, and its word recognition was relatively slow. I found some notes about phoneme recognition using their other project, pocketsphinx. I was able to download and compile it as well, but could not run any tests successfully. Has anyone used CMUSphinx or pocketsphinx with phonemes? Is it capable of fast, live output? Or are there other alternatives? I really am looking for something extremely basic, but fast.
Edit: I was able to get pocketsphinx recognizing phonemes, but it was too slow to use in the project.