
i want to build a little phoneme-based "dialog system" that listens to speech, converts it into a string of phonemes (however wrong, it doesn't matter), processes and stores them, and plays them back at the phoneme level. i plan to use either festival / mbrola or espeak with it, all running on a raspberry pi (the project is called babble pi).

i followed the really nice instructions here: https://wolfpaulus.com/jounal/embedded/raspberrypi2-sr/

and i get good recognition with this command:

pocketsphinx_continuous -hmm /usr/local/share/pocketsphinx/model/en-us/en-us -lm 3199.lm -dict 3199.dic -samprate 16000/8000/48000 -inmic yes

now i've read this article about phoneme recognition here on the sourceforge site: http://cmusphinx.sourceforge.net/wiki/phonemerecognition

i also realised that prealpha5 apparently has a new binary format. the phoneme recogniser article states that the english phoneme recogniser is part of the default installation package and suggests testing it with:

pocketsphinx_continuous -infile test/data/goforward.raw -hmm en-us -allphone model/en-us/en-us-phone.lm.dmp -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0

i assume that the phoneme article refers to older versions of (pocket-)sphinx, since it refers to the .dmp rather than the .bin file extension, so i tried:

pocketsphinx_continuous -infile test/data/goforward.raw -hmm en-us -allphone model/en-us/en-us-phone.lm.bin -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0

but i got the following error:

ERROR: "acmod.c", line 83: Folder 'en-us' does not contain acoustic model definition 'mdef'

looking at en-us, there is in fact only a .dict, a .lm.bin and the phone file, plus another en-us directory that contains an mdef file and several other files. copying it up a level doesn't help.

so, what to do? uninstall prealpha5 and install version 4? or can i download the right file somewhere?

Nikolay Shmyrev
assadollahi

1 Answer


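The argument to the `-hmm` option, which you set to `en-us`, is a path to the acoustic model folder. In your case it is a relative path, so it is resolved against the directory you run the command from. If the language model path is `model/en-us/en-us-phone.lm.bin`, then the `-hmm` path must be `model/en-us/en-us`, not simply `en-us`.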
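As a rough sketch of the point above, the shell helpers below check that a candidate `-hmm` directory contains an `mdef` file (the file named in the error message) and print the command from the question with the corrected path. The function names `has_mdef` and `phone_cmd` are made up for illustration, and the paths assume the command is run from the directory that contains `model/` and `test/`, as in the question.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory passed as $1 contains
# an 'mdef' file, which is what the acmod.c error message asks for.
has_mdef() {
    [ -f "$1/mdef" ]
}

# Hypothetical helper: print (not run) the phoneme-recognition command
# from the question, with -hmm pointing at the inner en-us folder.
#   $1 = model root, e.g. model/en-us
#   $2 = input file, e.g. test/data/goforward.raw
phone_cmd() {
    model_root=$1
    infile=$2
    printf '%s\n' "pocketsphinx_continuous -infile $infile -hmm $model_root/en-us -allphone $model_root/en-us-phone.lm.bin -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0"
}
```

For example, `has_mdef model/en-us/en-us && phone_cmd model/en-us test/data/goforward.raw` prints the command only when the inner folder really holds the acoustic model, so a wrong relative path shows up before the decoder starts.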

Nikolay Shmyrev
  • okay, so your hint was in the right direction, i guess. en-us/en-us didn't work, but with model/en-us/en-us the program runs until "INFO: continuous.c(303): pocketsphinx_continuous COMPILED ON ", then takes about 40 sec on the raspberry pi 2 before i get the phones list. so, YEAH it works! but: wow, is that slow! i've read that it loads 130k words, is that what makes it slow? could i trim the .dict file to the top 10k (assuming they are sorted by frequency)? – assadollahi Aug 06 '15 at 18:56
  • Phonetic recognition is slow because it considers an enormous number of variants. You can add the option `-allphone_ci yes` to the command line to make it faster but less accurate. The 130k-word vocabulary is irrelevant. – Nikolay Shmyrev Aug 06 '15 at 20:17
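
A minimal sketch of how the two variants from the last comment could be compared: the helper prints the phoneme-decoding command with or without `-allphone_ci yes`. The function name `allphone_cmd` is hypothetical, and the paths are the ones used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the phoneme-decoding command.
#   $1 = "yes" to append -allphone_ci yes, anything else to omit it
allphone_cmd() {
    ci=$1
    cmd="pocketsphinx_continuous -infile test/data/goforward.raw -hmm model/en-us/en-us -allphone model/en-us/en-us-phone.lm.bin -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0"
    if [ "$ci" = "yes" ]; then
        # Flag named in the comment above: faster but less accurate.
        cmd="$cmd -allphone_ci yes"
    fi
    printf '%s\n' "$cmd"
}
```

Running `time sh -c "$(allphone_cmd no)"` and then `time sh -c "$(allphone_cmd yes)"` on the Pi would show how much the option helps for a given recording.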