I installed "sphinxbase" and "pocketsphinx" on Windows and ran the "PocketSphinxDemo" in Eclipse and on my phone. Next, I want to add Turkish language support to this application. Recognizing a few words or sentences is enough to begin with, so it should be fairly easy. I could not find a ready-made Turkish model on VoxForge. Is there another website where I can find one, or a tool I can use to create one easily?

I used lmtool, but the pronunciations in the generated .dic file are English. How can I create a .dic file for Turkish?

g1904

1 Answer

You need a list of words first of all. After that you can use espeak rules to create a phonetic dictionary:

espeak -v tr -x
Türkçe 
tYRktS'E

You only need to parse the output and put it in the dictionary in a letters-only format. You just need to create a map to a letter-only phoneset, not necessarily a map to ARPAbet. Open a text editor and create a map:

t t
y yy
r rr
k k
e ee
S' sh

So in the end you get entries like this:

türkçe t yy rr k t sh ee

That's it. There is no requirement to use ARPAbet. For more details, see the acoustic model training tutorial.
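
The mapping step can also be scripted. The following is only a minimal sketch in Python, not part of the CMUSphinx tooling: the function names and the `PHONE_MAP` table are hypothetical, the keys are written in the case espeak actually prints (`Y`, `R`, `E`), and the apostrophe is assumed to be espeak's stress mark, so it is dropped rather than mapped. The `espeak_phonemes` helper assumes an `espeak` binary is on the PATH.

```python
import subprocess

# Hypothetical map from espeak mnemonics to letter-only phone names.
# Any names work as long as each phone gets a unique one.
PHONE_MAP = {"t": "t", "Y": "yy", "R": "rr", "k": "k", "S": "sh", "E": "ee"}

# Assumption: these characters are stress marks or separators, not phones.
IGNORED = {"'", ",", " "}


def espeak_phonemes(word, voice="tr"):
    """Ask espeak for the phoneme mnemonics of one word (-q: no audio)."""
    result = subprocess.run(
        ["espeak", "-q", "-v", voice, "-x", word],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def to_phones(mnemonics, phone_map=PHONE_MAP):
    """Split a mnemonic string into mapped phones, longest key first."""
    keys = sorted(phone_map, key=len, reverse=True)
    phones = []
    i = 0
    while i < len(mnemonics):
        if mnemonics[i] in IGNORED:
            i += 1
            continue
        for key in keys:
            if mnemonics.startswith(key, i):
                phones.append(phone_map[key])
                i += len(key)
                break
        else:
            # Fail loudly so a missing symbol gets added to the map.
            raise KeyError(f"no mapping for {mnemonics[i]!r} in {mnemonics!r}")
    return phones


def dict_entry(word, mnemonics):
    """Format one dictionary line: the word, then its phones.

    Note: str.lower() is not Turkish-aware for I/İ.
    """
    return f"{word.lower()} {' '.join(to_phones(mnemonics))}"


print(dict_entry("Türkçe", "tYRktS'E"))
```

For a whole word list, each word would be passed through `espeak_phonemes` and then `dict_entry`, extending `PHONE_MAP` whenever a `KeyError` reports an unmapped symbol.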

Nikolay Shmyrev
  • I used espeak and generated the IPA symbols for the words. Now how can I convert them to the CMU Sphinx phoneset format, ARPAbet? – g1904 Apr 04 '13 at 18:22
  • There is no need to map to ARPAbet; you just need an arbitrary map to a letter-only phoneset. I've updated the answer with an example. – Nikolay Shmyrev Apr 05 '13 at 05:00
  • How can I map these symbols? As I understand it, capital R means rr, small r means r, and S' means sh. What about 'a or @, and so on? How can I find what they mean? – g1904 Apr 05 '13 at 17:40
  • You can select any reasonable mapping as long as the phone names are unique. For example, map a to a and @ to ax. Unleash your mind. – Nikolay Shmyrev Apr 05 '13 at 19:19