I have edited the dialog code to make it work for my project.

  1. I have created a text file with some of the possible sentences to be used in my work. I added the link in the comment section.
  2. I have followed the steps at http://cmusphinx.sourceforge.net/wiki/tutoriallm to build my language model using the web service.
  3. Then I edited the dialog code to be:

    package dialog;

    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.SpeechResult;
    import edu.cmu.sphinx.api.StreamSpeechRecognizer;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;

    public class EmployeeCode {

        // Default en-us acoustic model from the sphinx4-data jar.
        private static final String ACOUSTIC_MODEL = "resource:/edu/cmu/sphinx/models/en-us/en-us";
        // Dictionary and language model generated by the web service.
        private static final String DICTIONARY_PATH = "models/language/TAR0779/0779.dic";
        private static final String LANGUAGE_MODEL = "models/language/TAR0779/0779.lm";

        public static void main(String[] args) throws Exception {
            System.out.println("Loading models...");

            Configuration configuration = new Configuration();
            configuration.setAcousticModelPath(ACOUSTIC_MODEL);
            configuration.setDictionaryPath(DICTIONARY_PATH);
            configuration.setLanguageModelPath(LANGUAGE_MODEL);

            StreamSpeechRecognizer lmRecognizer = new StreamSpeechRecognizer(configuration);

            // try-with-resources closes the audio stream once decoding is finished.
            try (InputStream stream = new FileInputStream(
                    new File("/Users/ha/NetBeansProjects/Dialog/WAV/sample1.wav"))) {
                lmRecognizer.startRecognition(stream);

                // getResult() returns null when the end of the stream is reached.
                SpeechResult result;
                while ((result = lmRecognizer.getResult()) != null) {
                    System.out.println("You said: " + result.getHypothesis() + '\n');
                }

                lmRecognizer.stopRecognition();
            }
        }
    }

  4. After running it, the output is:

    run:
    Loading models...
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: *+NSN+
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: *+SPN+
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AA
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AE
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AO
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AW
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: AY
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: B
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: CH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: D
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: DH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: EH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: ER
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: EY
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: F
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: G
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: HH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: IH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: IY
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: JH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: K
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: L
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: M
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: N
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: NG
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: OW
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: OY
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: P
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: R
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: S
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: SH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: T
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: TH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: UH
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: UW
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: V
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: W
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: Y
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: Z
    Apr 16, 2015 2:04:10 PM edu.cmu.sphinx.linguist.acoustic.UnitManager getUnit INFO: CI Unit: ZH
    Apr 16, 2015 2:04:11 PM edu.cmu.sphinx.frontend.AutoCepstrum initDataProcessors INFO: Cepstrum component auto-configured as follows: autoCepstrum {MelFrequencyFilterBank, Denoise, DiscreteCosineTransform2, Lifter}
    Apr 16, 2015 2:04:11 PM edu.cmu.sphinx.linguist.dictionary.TextDictionary allocate INFO: Loading dictionary from: file:models/language/TAR0779/0779.dic
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.dictionary.TextDictionary allocate INFO: Loading filler dictionary from: jar:file:/Users/ha/Downloads/sphinx4-data-1.0-20150223.210601-7-sources.jar!/edu/cmu/sphinx/models/en-us/en-us/noisedict
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader logInfo INFO: Loading tied-state acoustic model from: jar:file:/Users/ha/Downloads/sphinx4-data-1.0-20150223.210601-7-sources.jar!/edu/cmu/sphinx/models/en-us/en-us
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool logInfo INFO: Pool means Entries: 16128
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool logInfo INFO: Pool variances Entries: 16128
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool logInfo INFO: Pool transition_matrices Entries: 42
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool logInfo INFO: Pool senones Entries: 5126
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.GaussianWeights logInfo INFO: Gaussian weights: mixture_weights. Entries: 15378
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool logInfo INFO: Pool senones Entries: 5126
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader logInfo INFO: Context Independent Unit Entries: 42
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.HMMManager logInfo INFO: HMM Manager: 137095 hmms
    Apr 16, 2015 2:04:12 PM edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel logInfo INFO: CompositeSenoneSequences: 0
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.linguist.acoustic.HMMPool dumpInfo INFO: Max CI Units 43
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.linguist.acoustic.HMMPool dumpInfo INFO: Unit table size 79507
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.util.TimerPool showTimesShortTitle INFO: # ----------------------------- Timers----------------------------------------
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.util.TimerPool showTimesShortTitle INFO: # Name Count CurTime MinTime MaxTime AvgTime TotTime
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Load AM 1 3.0410s 3.0410s 3.0410s 3.0410s 3.0410s
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Load Dictionary 1 0.0520s 0.0520s 0.0520s 0.0520s 0.0520s
    Apr 16, 2015 2:04:13 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Compile 1 1.8290s 1.8290s 1.8290s 1.8290s 1.8290s
    Apr 16, 2015 2:04:17 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioUsage INFO: This Time Audio: 0.95s Proc: 3.15s Speed: 3.32 X real time
    Apr 16, 2015 2:04:17 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioSummary INFO: Total Time Audio: 0.95s Proc: 3.15s 3.32 X real time
    Apr 16, 2015 2:04:17 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Mem Total: 212.50 Mb Free: 70.12 Mb
    Apr 16, 2015 2:04:17 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Used: This: 142.38 Mb Avg: 142.38 Mb Max: 142.38 Mb
    You said: WHAT IS

    Apr 16, 2015 2:04:20 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioUsage INFO: This Time Audio: 0.96s Proc: 2.45s Speed: 2.55 X real time
    Apr 16, 2015 2:04:20 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioSummary INFO: Total Time Audio: 1.91s Proc: 5.60s 2.93 X real time
    Apr 16, 2015 2:04:20 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Mem Total: 237.00 Mb Free: 141.00 Mb
    Apr 16, 2015 2:04:20 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Used: This: 96.00 Mb Avg: 119.19 Mb Max: 142.38 Mb
    You said: MANY MEN

    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioUsage INFO: This Time Audio: 1429182208.00s Proc: 1.19s Speed: 0.00 X real time
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioSummary INFO: Total Time Audio: 1429182208.00s Proc: 6.79s 0.00 X real time
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Mem Total: 247.50 Mb Free: 144.35 Mb
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Used: This: 103.15 Mb Avg: 113.84 Mb Max: 142.38 Mb
    You said: MANY

    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.TimerPool showTimesShortTitle INFO: # ----------------------------- Timers----------------------------------------
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.TimerPool showTimesShortTitle INFO: # Name Count CurTime MinTime MaxTime AvgTime TotTime
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Load AM 1 3.0410s 3.0410s 3.0410s 3.0410s 3.0410s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Load Dictionary 1 0.0520s 0.0520s 0.0520s 0.0520s 0.0520s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Score 586 0.0000s 0.0000s 0.2270s 0.0031s 1.8140s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Prune 2043 0.0000s 0.0000s 0.0020s 0.0000s 0.0280s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Grow 2051 0.0000s 0.0000s 0.9200s 0.0025s 5.1330s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Frontend 298 0.0000s 0.0000s 0.2100s 0.0009s 0.2640s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.util.Timer showTimesShort INFO: Compile 1 1.8290s 1.8290s 1.8290s 1.8290s 1.8290s
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.SpeedTracker showAudioSummary INFO: Total Time Audio: 1429182208.00s Proc: 6.79s 0.00 X real time
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Mem Total: 247.50 Mb Free: 141.87 Mb
    Apr 16, 2015 2:04:21 PM edu.cmu.sphinx.instrumentation.MemoryTracker calculateMemoryUsage INFO: Used: This: 105.63 Mb Avg: 111.79 Mb Max: 142.38 Mb
    BUILD SUCCESSFUL (total time: 28 seconds)

The correct result should be: what is the minimum salary.
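To put a number on the gap between the expected sentence and the decoder output, a word error rate (WER) can be computed. The following is only a rough sketch and not part of Sphinx4: the class name `Wer` and its methods are made up for illustration, and joining the three partial hypotheses from the log into one string is an assumption about how the result should be scored.

```java
import java.util.Locale;

public class Wer {

    // Split on whitespace, ignoring case; an empty string yields no tokens.
    static String[] tokenize(String s) {
        String t = s.trim().toLowerCase(Locale.ROOT);
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // Word-level Levenshtein distance (substitutions, insertions, deletions).
    static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) d[i][0] = i;
        for (int j = 0; j <= hyp.length; j++) d[0][j] = j;
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int cost = ref[i - 1].equals(hyp[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }
        return d[ref.length][hyp.length];
    }

    // WER = edit distance divided by the number of reference words.
    static double wer(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) editDistance(ref, tokenize(hypothesis)) / ref.length;
    }

    public static void main(String[] args) {
        // Hypothesis: the three partial results from the log, joined (an assumption).
        double wer = wer("what is the minimum salary", "WHAT IS MANY MEN MANY");
        // 3 substitutions over 5 reference words
        System.out.println("WER: " + wer);
    }
}
```

With the sentences above, two of the five reference words are matched and the other three are substituted, so the WER is 3/5.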

My WAV file is: https://www.mediafire.com/?khgyc9bhltz0z3b

How can I improve the recognition accuracy for my WAV file?

Thanks in advance

1 Answer

    private static final String ACOUSTIC_MODEL = "models/acoustic/wsj";

This is wrong; you need to use the default en-us model.

> I have deleted a lot of lines of missing a phonetic transcription for words in my corpus

The corpus must be a plain text file, not an RTF file. You need to create the language model and dictionary again.
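As a minimal sketch of preparing such a plain-text corpus, the snippet below reads a UTF-8 text file, replaces punctuation with spaces, and writes one cleaned sentence per line. The class name `CorpusCleaner`, its method names, and the choice to keep letters, digits, and apostrophes are assumptions for illustration; they are not taken from the CMUSphinx tools, and the file must already be exported as plain text (this does not convert RTF).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class CorpusCleaner {

    // Replace everything except letters, digits, apostrophes, and whitespace
    // with a space, then collapse runs of whitespace and trim the ends.
    static String cleanLine(String line) {
        return line.replaceAll("[^\\p{L}\\p{N}'\\s]", " ")
                   .replaceAll("\\s+", " ")
                   .trim();
    }

    // Clean every line and drop the ones that end up empty.
    static List<String> cleanLines(List<String> lines) {
        List<String> cleaned = new ArrayList<>();
        for (String line : lines) {
            String c = cleanLine(line);
            if (!c.isEmpty()) {
                cleaned.add(c);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CorpusCleaner <input.txt> <output.txt>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), cleanLines(lines), StandardCharsets.UTF_8);
    }
}
```

The cleaned output file is what would then be uploaded to rebuild the language model and dictionary.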

Nikolay Shmyrev
  • Thank you for your reply @Nikolay-Shmyrev. I have edited the code in my question, and the output has changed too. Can you please help me? – user3246661 Apr 11 '15 at 07:07
  • I have no idea what you corrupted there. I suggest you start from a clean demo again and make your changes. – Nikolay Shmyrev Apr 11 '15 at 07:22
  • Thank you @nikolay-shmyrev. The error was raised after I added the en-us model as the acoustic model. I also changed the dictionary path. When I put them back as: private static final String ACOUSTIC_MODEL = "models/acoustic/wsj"; private static final String DICTIONARY_PATH = "models/acoustic/wsj/dict/cmudict.0.6d"; private static final String LANGUAGE_MODEL = "models/language/TAR7772/7772.lm"; the code works, but with very low accuracy. Any help please? – user3246661 Apr 11 '15 at 08:49
  • You are not using the latest sphinxtrain as suggested by the tutorial http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4, and you are not using the latest models from that sphinxtrain – Nikolay Shmyrev Apr 11 '15 at 09:49
  • I am using the latest version of them. What should I change? – user3246661 Apr 11 '15 at 11:14
  • Those are wrong. You can find a link to the proper one in the tutorial – Nikolay Shmyrev Apr 11 '15 at 11:31
  • Thank you. Here are the new jars: http://i.stack.imgur.com/MhYjV.png . I still get the same output. What should I do? I really need to get it done today, so sorry for the inconvenience :$ – user3246661 Apr 11 '15 at 13:56
  • I have no idea what sphinx-0.7 is doing in your classpath. It is not the correct jar. You also need to update the question with the code you have and the result you get. – Nikolay Shmyrev Apr 11 '15 at 18:50
  • and deleted sphinx-0.7 jar – user3246661 Apr 11 '15 at 19:46
  • You need to remove punctuation (dots) from the corpus before LM training. – Nikolay Shmyrev Apr 11 '15 at 19:57
  • I have edited the code and the output in my question. We are now back to my main question, improving the accuracy. I'm using Sphinx in my master's thesis, and with this result it will give a very low percentage, which I believe it does not. Any help? – user3246661 Apr 12 '15 at 08:52
  • You need to add a loop to retrieve multiple results, not just the first result. Then it will decode the whole phrase, not just its beginning – Nikolay Shmyrev Apr 12 '15 at 13:00
  • I did add the loop, but nothing has improved. If I wanted to refer to some methods for improving the accuracy in my thesis, what would you suggest? – user3246661 Apr 12 '15 at 16:56
  • You need to learn to be more precise. What do you mean by "nothing has improved"? What is the result? – Nikolay Shmyrev Apr 12 '15 at 17:12
  • I'm sorry. I have edited the question and the output. Thanks in advance – user3246661 Apr 16 '15 at 10:09
  • Your loop is wrong. See the transcriber demo source to see how to write the loop. – Nikolay Shmyrev Apr 16 '15 at 10:22
  • I edited the loop and the result again. The results are still not what I expect. Thank you very much for your reply. – user3246661 Apr 16 '15 at 11:07
  • I would appreciate your reply very much... Thanks – user3246661 Apr 17 '15 at 16:50
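
When the models and the loop are in order and accuracy is still poor, the audio format may be worth ruling out: the default en-us acoustic model is meant for 16 kHz, 16-bit, mono, little-endian PCM audio. The sketch below uses the standard `javax.sound.sampled` API to print a WAV file's format and compare it against those values. The class name `WavFormatCheck` and the helper `matchesExpected` are hypothetical names chosen for illustration, and nothing here is confirmed about the actual sample1.wav.

```java
import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class WavFormatCheck {

    // True if the format is signed PCM, 16 kHz, 16-bit, mono, little-endian.
    static boolean matchesExpected(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        // getAudioInputStream parses the WAV header and exposes the audio format.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]))) {
            AudioFormat format = in.getFormat();
            System.out.println("Format: " + format);
            System.out.println(matchesExpected(format)
                    ? "Matches 16 kHz, 16-bit, mono PCM."
                    : "Does not match 16 kHz, 16-bit, mono PCM; consider resampling.");
        }
    }
}
```

If the file does not match, it can be converted with any audio tool before decoding; the check itself does not modify the file.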