
I'm using a Gaussian mixture model to estimate a log-likelihood function (the parameters are estimated by the EM algorithm). I'm using Matlab. My data is of size 17991402×1: 17,991,402 data points of one dimension.

When I run gmdistribution.fit(X,2) I get the desired output.

But when I run gmdistribution.fit(X,k) for k > 2, the code crashes with an "OUT OF MEMORY" error. I have also tried an open-source implementation, which gives me the same problem. Can someone help me out here? I'm basically looking for code that will let me fit different numbers of components on such a large dataset.

Thanks!

ashwin shanker

1 Answer


Is it possible for you to decrease the number of iterations? The default MaxIter is 100.

OPTIONS = statset('MaxIter',50,'Display','final','TolFun',1e-6);
gmdistribution.fit(X,3,'Options',OPTIONS)

Or you may consider under-sampling the original data.

A general solution to the out-of-memory problem is described in this document.
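As a sketch of the under-sampling idea (the subsample size of 1e6 and the chunk size here are illustrative, not tuned values): fit the model on a random subset, then evaluate the log-likelihood of the full data in chunks so no giant intermediate array is allocated.

```matlab
% Randomly subsample the data before fitting (illustrative size: 1e6 points)
n = numel(X);
idx = randsample(n, 1e6);          % randsample is in the Statistics Toolbox
Xs = X(idx);

% Fit on the subsample with a capped iteration count
opts = statset('MaxIter', 50, 'Display', 'final', 'TolFun', 1e-6);
gm = gmdistribution.fit(Xs, 3, 'Options', opts);

% Evaluate the log-likelihood of the full data in chunks to limit memory use
logL = 0;
chunk = 1e6;
for i = 1:chunk:n
    j = min(i + chunk - 1, n);
    logL = logL + sum(log(pdf(gm, X(i:j))));
end
```

Whether a 1e6-point subsample is representative depends on your data, but for a one-dimensional mixture it is usually far more than enough to pin down the component parameters.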

lennon310
  • Undersampling is a bit tough for me to do due to the nature of my data. I reduced the number of iterations but I get the same result. Any further suggestions? – ashwin shanker Dec 27 '13 at 02:19
  • A general solution is from this document: http://www.mathworks.com/help/matlab/matlab_prog/resolving-out-of-memory-errors.html – lennon310 Dec 27 '13 at 04:58