
How to decide each GMM component weight

Regarding Gaussian Mixture Models (GMMs) for classification and clustering: the weights of the Gaussian components are often initialized uniformly, i.e., each set to 1/K where K is the total number of components.

This seems to be accepted as common practice in most textbooks, papers, and practical applications.

  1. Is there any theoretical work on this issue?
  2. Or is it actually a trivial problem?

Any clues are welcome.

1 Answer


One sensible thing to do is to set the mixture weights to the prior probabilities, but in most cases I have seen, the mixture weights are treated as hidden variables and estimated through EM.
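
For illustration, here is a minimal sketch of that behavior using scikit-learn's `GaussianMixture` (an assumption; the answer does not name a library). Even starting from uniform weights, EM re-estimates them from the data:

```python
# Minimal sketch, assuming scikit-learn and NumPy: the mixture weights
# start uniform (1/K), but EM re-estimates them, so the initialization
# mostly affects convergence, not the final fitted weights.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two clusters with unequal sizes (70% / 30%).
X = np.vstack([rng.normal(0.0, 1.0, size=(700, 2)),
               rng.normal(5.0, 1.0, size=(300, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
# EM recovers weights near {0.7, 0.3}, not {0.5, 0.5}
# (component order may vary).
print(gmm.weights_)
```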

Another way to go about it, and one that makes some sense, is to run k-means clustering with k equal to the number of mixture components you want and initialize the weights proportionally to the cluster sizes, as in the sketch below.
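
A sketch of that k-means initialization idea (again assuming scikit-learn; `weights_init` is an actual `GaussianMixture` parameter): cluster first, set each initial weight to that cluster's share of the data, then run EM.

```python
# Sketch: k-means-based initialization of GMM mixture weights.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(700, 2)),
               rng.normal(5.0, 1.0, size=(300, 2))])

k = 2
# Cluster first; each initial weight is that cluster's share of the data.
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
w0 = np.bincount(labels, minlength=k) / len(labels)

gmm = GaussianMixture(n_components=k, weights_init=w0, random_state=0).fit(X)
print(w0, gmm.weights_)  # initial vs. EM-refined weights
```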

If you do know the mixture membership for some of your training data, you could use it to estimate the prior probabilities and initialize your mixture weights from those, but I have never seen such a case in practice.
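
A minimal sketch of that partially-labeled case (`known_labels` is a hypothetical name for a labeled subset): the empirical class frequencies serve directly as initial mixture weights.

```python
# Sketch: estimate mixture weights from a (made-up) labeled subset.
import numpy as np

known_labels = np.array([0, 0, 0, 1, 0, 1, 0])  # hypothetical memberships
k = 2
weights_init = np.bincount(known_labels, minlength=k) / len(known_labels)
print(weights_init)  # [0.714..., 0.285...]
```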

On a side note, there is no principled method for setting the number of mixture components, and I think the scientific community is fairly convinced there is none.

Neo M Hacker
  • When a GMM is used for clustering, the assignment of each sample is usually based on its maximum posterior, so different weights would result in different clustering results. – pythonroar Apr 28 '13 at 05:03
  • I'm wondering what the optimal weights for the components would be: should the weights be equal, proportional to the number of samples in each k-means cell, or chosen by some other criterion? – pythonroar Apr 28 '13 at 05:08