I have written a speech recognition application using CMU Sphinx4, following the details from this link. I have defined the acoustic model, dictionary, and language model as below:
configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");
With the above configuration, transcribing a 20-minute WAV file takes close to 20 minutes. I then tried to pass a user-defined config.xml, but I could not find a configuration manager option for passing a user-defined config.xml in the current version of Sphinx4. So I wrote my own recognizer by extending the AbstractSpeechRecognizer class (this may be useless), changed a few parameters in config.xml, and tried again, but there was still no improvement.
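To put a number on the slowdown, it can be expressed as a real-time factor: processing time divided by audio duration. Below is a minimal sketch of how that could be measured using only the standard javax.sound.sampled API. The class name RealTimeFactor, its method names, and the default file name are hypothetical and not part of Sphinx4; the 1200-second processing time is simply the roughly 20 minutes reported above.

```java
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

// Hypothetical helper, not part of Sphinx4: quantifies how slow a transcription run is.
class RealTimeFactor {

    // Duration of a WAV file in seconds, derived from its frame count and frame rate.
    static double durationSeconds(File wav) throws Exception {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wav)) {
            long frames = in.getFrameLength();
            if (frames == AudioSystem.NOT_SPECIFIED) {
                throw new IllegalStateException("Frame length is not available for " + wav);
            }
            return frames / (double) in.getFormat().getFrameRate();
        }
    }

    // Real-time factor: processing time divided by audio duration.
    // A value near 1.0 means transcription takes about as long as the audio itself.
    static double realTimeFactor(double processingSeconds, double audioSeconds) {
        if (audioSeconds <= 0) {
            throw new IllegalArgumentException("Audio duration must be positive");
        }
        return processingSeconds / audioSeconds;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the file name is a placeholder; pass the real path as an argument.
        File wav = new File(args.length > 0 ? args[0] : "output.wav");
        double audioSeconds = durationSeconds(wav);
        double processingSeconds = 20 * 60; // roughly what I observed for a 20-minute file
        System.out.printf("audio=%.1fs rtf=%.2f%n",
                audioSeconds, realTimeFactor(processingSeconds, audioSeconds));
    }
}
```

With the numbers above, a 20-minute file transcribed in about 20 minutes corresponds to a real-time factor of about 1.0.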
I downloaded video and audio from multiple sources and converted them to WAV files using FFmpeg.
The command is as below:
ffmpeg -i input.mp3 -acodec pcm_s16le -ac 1 -ar 16000 output.wav
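To rule out a conversion problem, the converted file can be checked against the parameters requested in the command above (signed 16-bit little-endian PCM, 1 channel, 16000 Hz). This is only a sketch using the standard javax.sound.sampled API; the class name WavFormatCheck and the method matchesFfmpegTarget are hypothetical names chosen for illustration.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import java.io.File;

// Hypothetical helper: verifies that a WAV file has the format requested from ffmpeg
// (-acodec pcm_s16le -ac 1 -ar 16000).
class WavFormatCheck {

    // True only for signed 16-bit little-endian PCM, mono, 16 kHz.
    static boolean matchesFfmpegTarget(AudioFormat fmt) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())
                && fmt.getSampleSizeInBits() == 16
                && fmt.getChannels() == 1
                && fmt.getSampleRate() == 16000f
                && !fmt.isBigEndian();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "output.wav" is the file produced by the ffmpeg command.
        File wav = new File(args.length > 0 ? args[0] : "output.wav");
        AudioFormat fmt = AudioSystem.getAudioFileFormat(wav).getFormat();
        System.out.println(fmt + " -> "
                + (matchesFfmpegTarget(fmt) ? "matches" : "does not match the requested format"));
    }
}
```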
Environment Details:
Java 8
Ubuntu 14.04
RAM: 4 GB
Processor: Intel Core i5
What I would like to know is: what am I missing here, and how can I improve the performance?