
The CMUSphinx documentation says that adapting the built-in acoustic model works the same way for both Sphinx4 and PocketSphinx, and there is a separate document covering Sphinx4 adaptation.

But after adaptation, how do I transcribe an audio file?
With PocketSphinx we run the following command:

pocketsphinx_continuous -hmm en-us-adapt -lm en-us.lm.bin -dict cmudict-en-us.dict -infile 01.wav > 1.txt

and it transcribes the audio file into a text file. With Sphinx4, how do I transcribe an audio file into a text file? Is there a direct command for transcribing an audio file with Sphinx4?

Nikolay Shmyrev

1 Answer


The tutorial says:

To use the trained model in sphinx4, you need to update the model location in the code.

When you configure the model location in your Sphinx4 code, you can point the path to your adapted model:

configuration.setAcousticModelPath("file:/home/pawan/sphinx4/adapted-model");

or, if you placed the model in resources:

configuration.setAcousticModelPath("resource:/com/example/adapted-model");
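
For the file-to-text part of the question, here is a minimal sketch of what the PocketSphinx command above might look like with the Sphinx4 Java API. It assumes the sphinx4-core jar is on the classpath, that the dictionary and language model files from the PocketSphinx command sit in the working directory, and that 01.wav is 16 kHz, 16-bit, mono. The class name TranscribeFile and all paths are placeholders, not something the tutorial prescribes.

```java
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.PrintWriter;

// Hypothetical class name; every path below is an assumption for illustration.
public class TranscribeFile {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();

        // Adapted acoustic model, as in the answer above.
        configuration.setAcousticModelPath("file:/home/pawan/sphinx4/adapted-model");
        // Dictionary and language model reused from the PocketSphinx command
        // (assumed to be in the working directory).
        configuration.setDictionaryPath("file:cmudict-en-us.dict");
        configuration.setLanguageModelPath("file:en-us.lm.bin");

        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);

        try (InputStream audio = new FileInputStream("01.wav");
             PrintWriter out = new PrintWriter("1.txt", "UTF-8")) {
            recognizer.startRecognition(audio);

            // getResult() returns null once the stream is exhausted.
            SpeechResult result;
            while ((result = recognizer.getResult()) != null) {
                out.println(result.getHypothesis());
            }

            recognizer.stopRecognition();
        }
    }
}
```

Each hypothesis is written as one line of 1.txt, which roughly mirrors redirecting pocketsphinx_continuous output to a file.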
Nikolay Shmyrev
  • Thank you, Nikolay. – pawan sharma Jul 06 '16 at 05:16
  • One more thing: the CMU Sphinx acoustic model is already trained on some dataset. Which data did CMU Sphinx use to train the default acoustic model? Is it computer-generated or natural voice? I am asking because I want to train it with the TIMIT LDC dataset for better accuracy. Will that improve accuracy for general US voices or not? – pawan sharma Jul 06 '16 at 05:38
  • TIMIT is a small toy database; it will not improve accuracy. The current CMUSphinx model is trained on much larger sets of natural voice data from thousands of speakers. – Nikolay Shmyrev Jul 06 '16 at 07:18
  • In the end I have to analyse the text, so I need about 70 to 80% accuracy for native US speakers. PocketSphinx gives me about 50 to 60% accuracy on clear US news recordings and YouTube audio, but with a computer-generated voice it gives me 95% accuracy. Why is there this difference? Please tell me which current model I should use, which one is stable, and which platform to choose, PocketSphinx or Sphinx4. – pawan sharma Jul 06 '16 at 10:48
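
The comments do not say how the accuracy figures were measured. One common way is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. Here is a minimal sketch in plain Java. The class name WordAccuracy and the example sentences are made up for illustration, and treating accuracy as 1 - WER is an assumption about what the percentages above mean.

```java
import java.util.Locale;

// Hypothetical helper for comparing a reference transcript with recognizer output.
public class WordAccuracy {

    // Word-level Levenshtein distance (substitutions, insertions, deletions).
    public static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) d[i][0] = i;
        for (int j = 0; j <= hyp.length; j++) d[0][j] = j;
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[ref.length][hyp.length];
    }

    // Lowercase and split on whitespace; punctuation handling is left out on purpose.
    public static String[] words(String text) {
        String t = text.toLowerCase(Locale.ROOT).trim();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // WER = edit distance / number of reference words.
    public static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = words(reference);
        String[] hyp = words(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference transcript is empty");
        }
        return (double) editDistance(ref, hyp) / ref.length;
    }

    public static void main(String[] args) {
        String reference = "the quick brown fox jumps";
        String hypothesis = "the quick brown box jumps";
        double wer = wordErrorRate(reference, hypothesis);
        // Prints: WER = 0.20, accuracy = 0.80
        System.out.printf(Locale.ROOT, "WER = %.2f, accuracy = %.2f%n", wer, 1 - wer);
    }
}
```

Scoring the same recordings this way with both PocketSphinx and Sphinx4 output would make the two platforms directly comparable.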