How to do Language Modeling using HTK

Question

I am confused about how to use HTK for language modeling. I followed the tutorial example from the VoxForge site:

http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial

After training and testing I got around 78% accuracy. I did this for my native language. Now I have to use HTK for language modeling.

Is there any tutorial available for doing the same? Please help me.

Thanks, speech_tri

score 1 · Answer 1 · answered Jan 21 '17 at 08:46

If I understand your question correctly, you are trying to switch from a "grammar" to an "n-gram language model" approach. These are alternative ways of specifying which combinations of words the recognizer is allowed to return. Having followed the VoxForge process, you probably already have a grammar in place.

A language model is built by analyzing a corpus of text to estimate the probabilities of words appearing together. The text corpus used can be very specialized. A number of analysis tools, such as SRILM (http://www.speech.sri.com/projects/srilm/) and MITLM (https://github.com/mitlm/mitlm), will read a corpus and produce a model.
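To make "probabilities of words appearing together" concrete, here is a minimal Python sketch that estimates bigram probabilities by simple counting. The three-sentence corpus and the function name `train_bigram_model` are made up for illustration. Tools like SRILM and MITLM go further than this (smoothing, higher-order n-grams, writing a model file), so treat it as a picture of the idea and not as what those tools do internally.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Estimate unsmoothed bigram probabilities P(word | previous word)."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # <s> and </s> mark the start and end of each sentence.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    # Maximum-likelihood estimate: count(prev, word) / count(prev).
    return {
        (prev, word): count / history_counts[prev]
        for (prev, word), count in bigram_counts.items()
    }

# Hypothetical toy corpus, for illustration only.
corpus = ["call steve", "call john", "phone steve"]
model = train_bigram_model(corpus)
print(model[("call", "steve")])
```

Word pairs that never occur in the corpus get no entry at all here, which is one reason real toolkits apply smoothing.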

Since you are using words from your native language, you will need your own corpus of text to analyze. One way to get a text corpus would be to artificially generate a number of sentences from your existing grammar and use those as the corpus. Then, with the new language model in place, you just point the recognizer at it instead of the grammar and hope for the best.
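Generating sentences from a grammar could look roughly like the Python sketch below. The grammar here is a made-up toy held in a dictionary, not the actual grammar file format from the VoxForge tutorial, so you would first have to load your own grammar into a similar structure. The names `GRAMMAR`, `generate` and `generate_corpus` are just for this example.

```python
import random

# Hypothetical toy grammar: keys are nonterminals, everything else is a word.
GRAMMAR = {
    "S": [["COMMAND", "NAME"]],
    "COMMAND": [["call"], ["phone"], ["dial"]],
    "NAME": [["steve"], ["john"], ["mary", "smith"]],
}

def generate(symbol, rng):
    """Expand a symbol into a list of words by picking random alternatives."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual word
    expansion = rng.choice(GRAMMAR[symbol])
    words = []
    for part in expansion:
        words.extend(generate(part, rng))
    return words

def generate_corpus(n, seed=0):
    """Return n random sentences, one string per sentence."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [" ".join(generate("S", rng)) for _ in range(n)]

corpus_lines = generate_corpus(5)
for line in corpus_lines:
    print(line)
```

Writing these lines to a text file, one sentence per line, gives you something a tool such as SRILM or MITLM can read as its corpus. Keep in mind that a model trained this way can only reflect what the grammar already allows.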
