2

I need to use EM to estimate the mean and covariance of a Gaussian distribution for each of two classes. Some attribute values are missing from the data.

The class of each object is known, so the problem reduces to fitting a Gaussian model to data with missing elements.

Which is the best library to use?

How is the ECM algorithm different from the EM algorithm?

Alex Riley
damned

3 Answers

4

If you have access to the Statistics Toolbox, you can use the gmdistribution class to fit a Gaussian mixture model (GMM) using the EM algorithm.

Here is an example:

%# sample dataset
load fisheriris
data = meas(:,1:2);
label = species;

%# fit GMM using EM
K = 2;
obj = gmdistribution.fit(data, K);

%# assign points to mixtures: argmax_k P(M(k)|data)
P = posterior(obj, data);
[~,mIDX] = max(P,[],2);

%# GMM components
obj.mu             %# means
obj.Sigma          %# covariances
obj.PComponents    %# mixture weights

%# visualize original data clusters
figure
gscatter(data(:,1), data(:,2), label)

%# visualize mixtures found
figure
gscatter(data(:,1), data(:,2), mIDX), hold on
ezcontour(@(x,y)pdf(obj,[x y]), xlim(), ylim())


If not, check out the excellent Netlab Toolbox, which has a GMM implementation.

Amro
  • I think this should be used if the classes are not known. Since I know the classes, it is better to fit a Gaussian to each class separately. Hopefully that will give better results in my case. – damned Sep 09 '11 at 12:14
1

Thanks, all. I am using ecmnmle to estimate the parameters and then obtaining the marginal distributions, which are used later in Bayes classification. It works fairly well, with accuracies of 0.9 and 0.69 on the 2 classes.
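A minimal sketch of this kind of pipeline, not the exact code used here. It assumes the Financial Toolbox (for ecmnmle) and the Statistics Toolbox (for mvnpdf and the fisheriris data). The toy data, the injected NaN values, the test sample x, and all variable names are made up for illustration. A sample with missing attributes is scored with the marginal Gaussian over the attributes it does have, which is just the matching sub-vector of the mean and sub-block of the covariance.

```matlab
%# toy data (assumption): first two iris species, with a few values knocked out
load fisheriris
X = meas(1:100, :);
y = [ones(50,1); 2*ones(50,1)];
X(3,2) = NaN;  X(60,4) = NaN;  X(75,1) = NaN;

%# fit one Gaussian per class; ecmnmle treats NaN entries as missing
classes = unique(y);
K = numel(classes);
mu = cell(1,K);  Sigma = cell(1,K);  prior = zeros(1,K);
for k = 1:K
    Xk = X(y == classes(k), :);
    [mu{k}, Sigma{k}] = ecmnmle(Xk);     %# mean is d-by-1, covariance is d-by-d
    prior(k) = size(Xk,1) / size(X,1);   %# class prior from class frequencies
end

%# classify a sample with a missing attribute (needs at least one observed value)
x = [5.0 NaN 1.5 0.2];
obs = ~isnan(x);
logpost = zeros(1,K);
for k = 1:K
    m = mu{k}(obs)';                     %# marginal mean as a row vector
    S = Sigma{k}(obs, obs);              %# marginal covariance
    S = (S + S') / 2;                    %# guard against tiny asymmetry
    logpost(k) = log(prior(k)) + log(mvnpdf(x(obs), m, S));
end
[~, best] = max(logpost);
predicted = classes(best)
```

The posterior is left unnormalized because only the argmax is needed for classification.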

damned
0

Please take a look at the PMTK toolkit.

Here is its EM implementation (it fits a mixture of Gaussians where the data may have NaN entries).
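For orientation, the standard EM updates for a single Gaussian with missing entries can be written as below. This is a textbook sketch, not a transcription of the PMTK code, and the notation is an assumption: for sample $i$, $o$ indexes the observed attributes and $m$ the missing ones. A mixture version additionally weights each sample by its component responsibility.

```latex
% E-step: fill in the missing part of x_i with its conditional mean,
% and keep the conditional covariance of the missing block
\hat{x}_{i,m} = \mu_m + \Sigma_{mo}\,\Sigma_{oo}^{-1}\,(x_{i,o} - \mu_o),
\qquad
\hat{x}_{i,o} = x_{i,o}

C_{i,mm} = \Sigma_{mm} - \Sigma_{mo}\,\Sigma_{oo}^{-1}\,\Sigma_{om},
\qquad
C_i = 0 \ \text{in every other block}

% M-step: re-estimate the parameters from the completed data
\mu^{\text{new}} = \frac{1}{n}\sum_{i=1}^{n} \hat{x}_i,
\qquad
\Sigma^{\text{new}} = \frac{1}{n}\sum_{i=1}^{n}
  \left[ (\hat{x}_i - \mu^{\text{new}})(\hat{x}_i - \mu^{\text{new}})^{\top} + C_i \right]
```

The $C_i$ term is what separates EM from simply imputing conditional means: without it, the covariance estimate is biased downward.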

Sergey
  • ??? Undefined function or method 'process_options' for input arguments of type 'cell'. Error in ==> mixGaussMissingFitEm at 12 [model.cpd.mu, model.cpd.Sigma, model.mixWeight, model.doMap, model.diagCov, EMargs] = ... – Oliver Amundsen Nov 05 '14 at 23:33