
I am very new to MATLAB, hidden Markov models, and machine learning, and I am trying to classify a given sequence of signals. Please let me know if the approach I have followed is correct:

  1. Create an N by N transition matrix and fill it with random values that sum to 1 for each row (N will be the number of states).
  2. Create an N by M emission/observation matrix and fill it with random values that sum to 1 for each row.
  3. Convert different instances of the sequence (i.e. each instance will be saying the word 'hello') into one long stream and feed each stream to the HMM train function such that:

    [new_transition_matrix, new_emission_matrix] = hmmtrain(sequence, old_transition_matrix, old_emission_matrix)

  4. Give the final transition and emission matrices to hmmdecode with an unknown sequence to get the probability, i.e. [posterior_states, logarithmic_probability] = hmmdecode(sequence, final_transition_matrix, final_emission_matrix)
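
For reference, here is roughly how I would wire these four steps together with the Statistics Toolbox functions (the sizes N and M and all variable names are placeholders of my own; training_sequences and unknown_sequence stand in for real data):

N = 4;   % number of states (placeholder value)
M = 8;   % number of observation symbols (placeholder value)

% steps 1 and 2: random row-stochastic initial guesses
trans_guess = rand(N, N);
trans_guess = bsxfun(@rdivide, trans_guess, sum(trans_guess, 2));  % rows sum to 1
emis_guess = rand(N, M);
emis_guess = bsxfun(@rdivide, emis_guess, sum(emis_guess, 2));

% step 3: estimate the model from training sequences of symbols 1..M
% (hmmtrain also accepts a matrix with one sequence per row, or a cell array)
[trans_est, emis_est] = hmmtrain(training_sequences, trans_guess, emis_guess);

% step 4: score an unknown sequence against the trained model
[posterior_states, log_probability] = hmmdecode(unknown_sequence, trans_est, emis_est);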

Ahmed-Anas
  • Have you done it? I also need help regarding HMM. Which toolbox have you used? –  May 14 '14 at 12:31
  • Like the answer stated below, I used Murphy's toolbox but used an HMM with Gaussian outputs. You can see the tutorial here: http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm_usage.html – Ahmed-Anas May 15 '14 at 17:14
  • I've seen this. I have some queries regarding HMM: does an HMM generate a trained file like `.xml` as in neural networks? I want to train trajectories with HMM, having X, Y, Z coordinates. What will the sequences be? I think my states are every new row of changing state position. –  May 16 '14 at 06:14

1 Answer


1. and 2. are correct. You have to be careful that your initial transition and emission matrices are not completely uniform; they should be slightly randomized for the training to work.
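
For example, one simple way to get row-stochastic matrices that are close to uniform but slightly perturbed (just a sketch, with N states and M symbols as in your question):

% uniform rows plus a little noise, renormalised so each row sums to 1
trans0 = ones(N, N) / N + 0.01 * rand(N, N);
trans0 = bsxfun(@rdivide, trans0, sum(trans0, 2));

emis0 = ones(N, M) / M + 0.01 * rand(N, M);
emis0 = bsxfun(@rdivide, emis0, sum(emis0, 2));

(`mk_stochastic` in the toolbox used below does the same row normalisation.)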

3. I would just feed in the 'Hello' sequences separately rather than concatenating them to form a single long sequence.

Let's say this is the sequence for Hello: [1,0,1,1,0,0]. If you form one long sequence from 3 'Hello' sequences, you would get:

data = [1,0,1,1,0,0,1,0,1,1,0,0,1,0,1,1,0,0]

This is not ideal; instead, you should feed the sequences in separately, like:

data = [1,0,1,1,0,0; 1,0,1,1,0,0; 1,0,1,1,0,0]

Since you are using MATLAB, I would recommend using the HMM toolbox by Kevin Murphy. It has a demo on how you can train an HMM with multiple observation sequences:

M = 3;
N = 2;

% "true" parameters
prior0 = normalise(rand(N,1));
transmat0 = mk_stochastic(rand(N,N));
obsmat0 = mk_stochastic(rand(N,M));

% training data: a 5-by-6 matrix, e.g. 5 different 'Hello' sequences of length 6
number_of_seq = 5;
seq_len = 6;
data = dhmm_sample(prior0, transmat0, obsmat0, number_of_seq, seq_len);

% initial guess of parameters
prior1 = normalise(rand(N,1));
transmat1 = mk_stochastic(rand(N,N));
obsmat1 = mk_stochastic(rand(N,M));

% improve guess of parameters using EM
[LL, prior2, transmat2, obsmat2] = dhmm_em(data, prior1, transmat1, obsmat1, 'max_iter', 5);
LL   % log-likelihood after each EM iteration

4. What you say is correct. Below is how you calculate the log probability in the HMM toolbox:

% use model to compute log[P(Obs|model)]
loglik = dhmm_logprob(data, prior2, transmat2, obsmat2)
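
Since your end goal is classification, the usual pattern is to train one HMM per word and label an unknown sequence with the model that gives the highest log-likelihood. A minimal sketch, assuming you have trained 'hello' and 'world' models (these variable names are mine, not part of the toolbox):

loglik_hello = dhmm_logprob(test_seq, prior_hello, transmat_hello, obsmat_hello);
loglik_world = dhmm_logprob(test_seq, prior_world, transmat_world, obsmat_world);

% pick the word whose model explains the sequence best
if loglik_hello > loglik_world
    disp('classified as: hello')
else
    disp('classified as: world')
end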

Finally: have a look at Rabiner's tutorial paper, 'A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition', on how the mathematics works if anything is unclear.

Hope this helps.

Zhubarb
  • Can you tell me what exactly `prior0` is? And thanks for the answer, really appreciated. – Ahmed-Anas Oct 19 '13 at 07:28
  • You actually do not need it in your case. `prior0`, `transmat0` and `obsmat0` are used in the data generation step. They represent the true model that generated the data, so that after training the system, the experimenter can compare the learned parameters `prior2`, `transmat2` and `obsmat2` to those that actually created the data. `prior0` specifically gives the probability of the system being in each state at the very beginning, e.g. 'does the model start at state1 or state2?' – Zhubarb Oct 20 '13 at 09:30
  • Thanks a lot. But I'm looking at 'HMMs with mixture of Gaussians outputs' right now since I require vector observations, and I am having trouble understanding the line 'Now let us fit a mixture of M=2 Gaussians'. What exactly does M represent? I am studying Gaussians right now but I can't figure out what M does. – Ahmed-Anas Oct 20 '13 at 09:36
  • In a typical MoG application, sample data are thought of as originating from various possible sources (in your case M of them), and the data from each particular source is modelled by a Gaussian. Google and Wikipedia should help a lot with this; for example, this [link](http://cseweb.ucsd.edu/~dasgupta/papers/mog.pdf). (See the sketch after these comments for where `M` enters in Murphy's toolbox.) – Zhubarb Oct 21 '13 at 07:29
  • @Zhubarb I'm a bit confused about `hmmtrain` and the HMM training process. I want to know whether some file like `.xml` is generated to keep a record of the trained model, or whether the probabilities are produced at run time, specifically when using EM training? –  May 14 '14 at 14:43
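
Regarding the mixture-of-Gaussians question in the comments above: the sketch below, adapted from the mhmm demo on the usage page linked earlier, shows where `M` (the number of Gaussian components in each state's output distribution) enters; the sizes O, T, Q and nex are made-up, and `randn` stands in for real feature vectors.

O = 3;  T = 10;  nex = 5;   % obs dimension, sequence length, #sequences (made up)
Q = 2;                      % number of hidden states
M = 2;                      % number of Gaussians in each state's output mixture
data = randn(O, T, nex);    % stand-in for real O-dimensional feature vectors

% initial guess of parameters
prior0 = normalise(rand(Q, 1));
transmat0 = mk_stochastic(rand(Q, Q));
[mu0, Sigma0] = mixgauss_init(Q*M, reshape(data, [O T*nex]), 'full');
mu0 = reshape(mu0, [O Q M]);
Sigma0 = reshape(Sigma0, [O O Q M]);
mixmat0 = mk_stochastic(rand(Q, M));

% improve guess of parameters using EM
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = ...
    mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 5);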