How do I have to train a HMM with Baum-Welch and multiple observations?

Question

I am having some problems understanding how the Baum-Welch algorithm exactly works. I read that it adjusts the parameters of the HMM (the transition and the emission probabilities) in order to maximize the probability that my observation sequence may be seen by the given model.

However, what does happen if I have multiple observation sequences? I want to train my HMM against a huge lot of observations (and I think this is what is usually done).

ghmm for example can take both a single observation sequence and a full set of observations for the baumWelch method.

Does it work the same in both situations? Or does the algorithm have to know all observations at the same time?

score 9 · Accepted Answer · answered May 30 '14 at 15:59

In Rabiner's paper, the parameters of GMMs (weights, means and covariances) are re-estimated in the Baum-Welch algorithm using these equations:

enter image description here

These are just for the single observation sequence case. In the multiple case, the numerators and denominators are just summed over all observation sequences, and then divided to get the parameters. (this can be done since they simply represent occupation counts, see pg. 273 of the paper)

So it's not required to know all observation sequences during an invocation of the algorithm. As an example, the HERest tool in HTK has a mechanism that allows splitting up the training data amongst multiple machines. Each machine computes the numerators and denominators and dumps them to a file. In the end, a single machine reads these files, sums up the numerators and denominators and divides them to get the result. See pg. 129 of the HTK book v3.4

How do I have to train a HMM with Baum-Welch and multiple observations?

1 Answers1