
I am currently using a Gaussian mixture model (GMM) to fit some data in MATLAB. I am using the gmdistribution.fit function and have a question about the fit.

The following code is used to generate the PDF.

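%Plot ECDFHIST
[ecdf_f,ecdf_x] = ecdf(X);
ecdfhist(ecdf_f,ecdf_x,25);    hold on;

%Fit GMM
options = statset('Display','final');
obj = gmdistribution.fit(X,3,'Options',options);

%Evaluate and overlay the fitted PDF
%(xaxis was not defined in the original snippet; a column grid over the data range is assumed here)
xaxis = linspace(min(X),max(X),500)';
gausspdf = pdf(obj, xaxis);
plot(xaxis, gausspdf);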

This example is a fit to one of my worst data sets:

pic

In short, my three-component GMM has two components with a large standard deviation (SD), but the third component has a tall peak and a small SD.

Just as I can change the number of bins with the ecdfhist function, is there a way to change the options passed to gmdistribution.fit, or something similar, to increase my bin width (that is, decrease the number of bins)?

Any help would be greatly appreciated!!

Many Thanks, M

  • You can add the figure as a link; someone with enough rep will edit the question to add it. – Matt Jun 26 '15 at 23:00
  • As suggested by @Matt here is a link to the image. This example is a fit to one of my worst data sets. The rest follow a good GMM of the 3rd order. As I mentioned above, I want to change the bin size so the middle mode here is not as narrow/tall. https://drive.google.com/file/d/0BynUatHhC5xDSTJfMUtDZmlRM0U/view?usp=sharing – MichaelD Jun 27 '15 at 12:55
  • I don't think Gaussian mixture model estimation is bin-based. – A. Donda Jun 27 '15 at 14:39
  • Any idea how it estimates the fit then, @A.Donda? Previously I was using the curve fitting toolbox (cftool) and fitting a gauss3 to the ksdensity, which gave me an excellent fit, but that gave me the model parameters in terms of the amplitude, centroid location and width of each mode instead of a proper GMM with mixing coefficients. Ideally I would like to pass the ksdensity into the code I have shown above, but that doesn't seem possible. Any ideas? Any help is greatly appreciated. @Matt thanks for adding the image :) – MichaelD Jun 27 '15 at 22:19
  • A GMM is fitted like any other probabilistic model: by maximum likelihood or Bayesian inference. For mixture models the problem is that the likelihood has local optima, and its maximum cannot be determined analytically. The most common solution is the expectation-maximization algorithm, an iterative numerical procedure. See https://en.wikipedia.org/wiki/Mixture_model#Expectation_maximization_.28EM.29 – A. Donda Jun 28 '15 at 11:52
  • The fit you show in the plot doesn't look too great, but I'd still expect this to be the best possible fit using a GMM with three components. Are you sure a GMM is the right model? For example, your data look like they are limited to an interval, which doesn't match well with a model that has infinite support. – A. Donda Jun 28 '15 at 11:55
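
To make the expectation-maximization procedure mentioned in the comments concrete, here is a minimal 1-D sketch in base MATLAB. It is only an illustration under stated assumptions, not the code that gmdistribution.fit runs internally: the synthetic data, the seed, the initialisation and the fixed iteration count are all made up for the example. The point it shows is that the fit works directly on the raw samples, so there is no bin count to tune.

```matlab
% Minimal 1-D EM sketch for a k-component Gaussian mixture (illustrative only).
% Assumptions: synthetic data, fixed seed, crude initialisation, fixed iteration count.
rng(0);
X = [randn(300,1) - 4; 0.3*randn(300,1); randn(300,1) + 4];  % hypothetical sample
k = 3;
n = numel(X);

% Initialise: means spread over the data range, equal weights, pooled variance.
mu     = linspace(min(X), max(X), k);   % 1-by-k component means
sigma2 = var(X) * ones(1, k);           % 1-by-k component variances
w      = ones(1, k) / k;                % 1-by-k mixing weights

for iter = 1:200
    % E-step: weighted density of every point under every component (n-by-k),
    % normalised per row to give the responsibilities r.
    dens = zeros(n, k);
    for j = 1:k
        dens(:, j) = w(j) * exp(-(X - mu(j)).^2 / (2*sigma2(j))) / sqrt(2*pi*sigma2(j));
    end
    r = bsxfun(@rdivide, dens, sum(dens, 2));

    % M-step: re-estimate weights, means and variances from the responsibilities.
    nk = sum(r, 1);                     % effective number of points per component
    w  = nk / n;
    for j = 1:k
        mu(j)     = sum(r(:, j) .* X) / nk(j);
        sigma2(j) = sum(r(:, j) .* (X - mu(j)).^2) / nk(j);
    end
end

% Log-likelihood at the parameters used in the last E-step.
logL = sum(log(sum(dens, 2)));
```

Because the likelihood has local optima, a different initialisation of mu, sigma2 and w can converge to a different solution, which is why EM is usually restarted from several starting points and the run with the highest log-likelihood is kept.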

0 Answers