I'm constructing a Hidden Markov Model to identify whether someone is saying "Yes" or "No". I have built the model, and I came across a tutorial on this page:
http://www.cslu.ogi.edu/tutordemos/nnet_recog/recog.html
And in this tutorial it says:
This figure traces the search paths for "yes" and "no" through a hypothetical matrix of probabilities. Even though the score for "no" is very low, it is still possible to find the most probable path for this word, if "yes" had not been in our vocabulary. The Viterbi search can be understood by reading through the following pseudo-code algorithm (with notation borrowed from Rabiner's paper, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition):
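The tutorial's pseudo-code is not reproduced here. As a rough illustration of the kind of search it describes, here is a minimal Viterbi sketch in Python, not the tutorial's own algorithm. The two states, the transition values and the per-frame numbers are all made up for the example; `frame_probs` merely stands in for the "hypothetical matrix of probabilities", and where those numbers should really come from is exactly what I am asking about.

```python
import math

def log(p):
    """Log of a probability, mapping 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(frame_logprobs, log_trans, log_init):
    """Return (best_log_score, best_state_path).

    frame_logprobs[t][s]: log probability of state s at frame t
    log_trans[i][j]:      log probability of moving from state i to state j
    log_init[s]:          log probability of starting in state s
    """
    n_states = len(log_init)
    # delta[s]: best log score of any path that ends in state s at the current frame
    delta = [log_init[s] + frame_logprobs[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(frame_logprobs)):
        new_delta, ptr = [], []
        for j in range(n_states):
            # Best predecessor state for arriving in state j at frame t
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j] + frame_logprobs[t][j])
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state to recover the path
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

# Made-up example: 3 frames, 2 states, left-to-right transitions only.
frame_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
trans = [[0.5, 0.5], [0.0, 1.0]]
init = [1.0, 0.0]

score, path = viterbi(
    [[log(p) for p in row] for row in frame_probs],
    [[log(p) for p in row] for row in trans],
    [log(p) for p in init],
)
print(path, math.exp(score))
```

Working in log space avoids underflow when many small probabilities are multiplied over a long utterance.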
I have read through both of the papers and I am still confused by where they say:
through a hypothetical matrix of probabilities
My question is: where does this matrix of probabilities come from? So far, I have done the following:
- Read in the Audio File
- Stripped the Audio signals that do not warrant consideration
- Split the signals that warrant consideration into blocks
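The last two steps could be sketched roughly as follows. This is only an illustration under assumptions: reading the file is left out, the signal is a plain list of samples, and "does not warrant consideration" is taken to mean "block energy below a threshold". The function names, the block size and the threshold are all hypothetical, and here the split happens before the low-energy blocks are dropped.

```python
def split_into_blocks(samples, block_size):
    """Split samples into consecutive, non-overlapping blocks.

    A trailing partial block is dropped.
    """
    n_blocks = len(samples) // block_size
    return [samples[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

def block_energy(block):
    """Mean squared amplitude of one block."""
    return sum(x * x for x in block) / len(block)

def keep_active_blocks(samples, block_size=4, energy_threshold=0.01):
    """Keep only the blocks whose energy exceeds the (assumed) threshold."""
    blocks = split_into_blocks(samples, block_size)
    return [b for b in blocks if block_energy(b) > energy_threshold]

# Synthetic signal: silence, a short burst, then silence again.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.0] * 4
active = keep_active_blocks(signal)
print(len(active))
```

With real speech the block size would be chosen in samples to match a frame duration, and the threshold would need tuning against the background noise level.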
This means that I am left with blocks that contain the phonemes. I have computed the zero-crossings of the data, which brings me to my point:
For "No", the zero-crossing count is very low.
For "Yes", the zero-crossing count is very high.
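For reference, the zero-crossing measure can be sketched like this. It is a minimal illustration, assuming each block is a list of samples; the function name and the two example blocks are made up and are not my real data.

```python
def zero_crossing_rate(block):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(block) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(block, block[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(block) - 1)

# Slowly varying block: a single sign change.
low = [1.0, 0.8, 0.5, -0.2, -0.6, -0.9]
# Rapidly alternating block: a sign change between every pair of samples.
high = [0.3, -0.3, 0.3, -0.3, 0.3, -0.3]

print(zero_crossing_rate(low), zero_crossing_rate(high))
```

Note that the result is a rate between 0 and 1 per block, but it is a signal feature, not a probability of any word or state.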
So in the example (given above) it says:
Even though the score for "no" is very low,
So could I just pass in the results from the zero-crossings as my probabilities? I'm confused and hope someone can help me with this.