nlinfit appears better than fitgmdist for fitting normal mixture

Question

I have a vector of data consisting of about 2 million samples that I suspect is a mixture of two gaussian's. I try to fit the data, Data, to a mixture using matlab's fitgmdist.

From histogram:

% histogram counts of X with 1000 bins.
[Yhist, x] = histcounts(Data, 1000, 'normalization', 'pdf');
x = (x(1:end-1) + x(2:end))/2;

Using fitgmdist:

% Increase no. of iterations. default is 100.
opts.MaxIter = 300;

% Ensure that it does all iterations.
opts.TolFun = 0;

GMModel = fitgmdist(Data, 2, 'Options', opts, 'Start', 'plus');
wts = GMModel.ComponentProportion;
mu = GMModel.mu;
sig = sqrt(squeeze(GMModel.Sigma));
Ygmfit = wts(1)*normpdf(x(:), mu(1), sig(1)) + wts(2)*normpdf(x(:), mu(2), sig(2));

Mixture results with fitgmdist: wts = [0.6780, 0.322], mu = [-7.6444, -9.7831], sig = [0.8243, 0.5947]

Next I try using nlinfit:

% Define the callback function for nlinfit.
function y = nmmix(a, x)
   a(1:2) = a(1:2)/sum(a);
   y = a(1)*normpdf(x(:), a(3), a(5)) + a(2)*normpdf(x(:), a(4), a(6));
end

init_wts = [0.66, 1-0.66];
init_mu = [-7.7, -9.75];
init_sig = [0.5, 0.5];
a = nlinfit(x(:), Yhist(:), @nmmix, [init_wts, init_mu, init_sig]);
wts = a(1:2)/sum(a(1:2));
mu = a(3:4);
sig = a(5:6);
Ynlinfit = wts(1)*normpdf(x(:), mu(1), sig(1)) + wts(2)*normpdf(x(:), mu(2), sig(2));

Mixture results with nlinfit: wts = [0.6349, 0.3651], mu = [-7.6305, -9.6991], sig = [0.6773, 0.6031]

% Plot to compare the results
figure;
hold on
plot(x(:), Yhist, 'b');
plot(x(:), Ygmfit, 'k');
plot(x(:), Ynlinfit, 'r');

It seems to be that the non-linear fit (red curve) is intuitively a better approximation to the histogram (blue curve) than "fitgmdist" (black curve). The results are similar even if I use a finer histogram, say with 100,000 bins.

What can be the source of this discrepancy?

Added later: Of course one would not expect the results to be the same, but I would expect that the visual quality of the two fits would be comparable.

You use two different algorithms, so I'm not surprised at all to see that they yield different results. Check the mathematical working of both to see where the difference stems from. — Adriaan, Sep 01 '17 at 12:53
@adriaan, I'm not concerned about the difference, which is to be expected, but I'm surprised that nlinfit appears to give a better fit. — skm, Sep 01 '17 at 13:24

nlinfit appears better than fitgmdist for fitting normal mixture

0 Answers0