
I have a .jar package that I downloaded from here: http://sourceforge.net/projects/arabisc/

It comes as a zip file named Dialog.zip, which contains two folders: lib and bin.

In the lib folder I found the following packages:

an4.jar jsapi.jar sphinx4.jar

So I added the above packages, the .gram files, and the dialog.config.xml file found in the bin folder to my Eclipse project. Then I edited the .gram files to contain the commands I would like the engine to recognize, but I get an error.

I believe my code is correct. The problem is that when I run my program, my whole computer freezes. I think that may be because the commands are not found in dialog.config.xml, or something like that.

So I started looking through the files I had included to see if I was missing something, and I found these two files inside the an4.dict package; their contents are all Arabic letters:

an4.dic an4.filler

And inside the an4.etc package there are three files:

an4.1000.mdef an4.5000.mdef an4.ci.mdef

Any idea how I can add my own commands?

0x01Brain
  • What you are asking is unclear. You said you got an error, then it was working, then an error again? Can you post your error stack trace and code? – Sid Aug 12 '15 at 16:52
  • Sorry, I meant that it freezes. – 0x01Brain Aug 12 '15 at 16:58
  • Can you tell me how I can add my own commands? I think I now have the dictionary file; what I need is to make a config file (the XML file dialog.config.xml) like the one they had in their package Dialog.jar. – 0x01Brain Aug 12 '15 at 17:06

1 Answer


It is better to train an Arabic acoustic model from scratch; Arabisc is not really relevant.

Nikolay Shmyrev
  • Thanks for answering. You were right: I noticed that it's built around only a few words, which means the dictionary is small. But could you please tell me exactly what steps I should take to make my own commands? I have made the list of commands and vocabulary words I want, and generated the .dict file and the .word file using lextool from here: http://www.speech.cs.cmu.edu/tools/lextool.html – 0x01Brain Aug 13 '15 at 04:24
  • I have a second question too: do I have to write the words in UTF-8 or in ASCII (the way they would be pronounced in English)? – 0x01Brain Aug 13 '15 at 04:37
  • Lextool supports English only, not Arabic; you have to create the Arabic dictionary yourself. Words in the dictionary can be UTF-8, but phonemes must be ASCII (ideally just letters). – Nikolay Shmyrev Aug 13 '15 at 13:01
  • So just to be clear, I have to make the .dict file in UTF-8 and the .phon file in ASCII. But I noticed that when I upload a words.txt that is in UTF-8, lmtool generates the following file types: .lm .dict .sent .vocab .log_pronounce. I have followed the guide, but it doesn't say what those files should contain. The same goes for the .phon file (phonemes): should it contain the letters showing how each word is pronounced? Thanks – 0x01Brain Aug 13 '15 at 17:41
  • I never wrote about a phone file. Phones in the dictionary must be in ASCII and words in UTF-8; you can check the French dictionary for an example. LMtool does not work for languages other than English, so you should not use it. I already wrote that above and I will not repeat it again. – Nikolay Shmyrev Aug 13 '15 at 20:51
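
To illustrate the dictionary rule from the comments above (words may be UTF-8, phones must be ASCII letters), here is a minimal Python sketch that checks a hand-written dictionary. It assumes one entry per line in the form "WORD PHONE1 PHONE2 ...", separated by whitespace. The function names, the sample Arabic words, and the phone symbols are all hypothetical and only for illustration; they are not taken from Arabisc or from any real Arabic phone set.

```python
import re

# Assumption: a phone is valid if it consists only of ASCII letters.
PHONE_RE = re.compile(r"[A-Za-z]+")


def check_dictionary_lines(lines):
    """Return a list of (line_number, message) problems found in the entries."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) < 2:
            problems.append((number, "entry has no phones"))
            continue
        # parts[0] is the word (any UTF-8 text); the rest are phones.
        for phone in parts[1:]:
            if not PHONE_RE.fullmatch(phone):
                problems.append((number, f"phone is not ASCII letters: {phone!r}"))
    return problems


def check_dictionary_file(path):
    """Read a UTF-8 dictionary file and check every entry."""
    with open(path, encoding="utf-8") as handle:
        return check_dictionary_lines(handle)


# Hypothetical entries: Arabic words with made-up ASCII phone symbols.
# The third entry is deliberately wrong (an Arabic letter used as a phone).
sample = [
    "افتح AE F T AE HH",
    "اغلق AE GH L IH Q",
    "قف ق F",
]

for number, message in check_dictionary_lines(sample):
    print(number, message)
```

Running it on the sample reports only the third entry. For a real file you would call check_dictionary_file with the path to your own .dict file; the phone symbols it contains still have to match the phone set of whatever acoustic model you train.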