2

I have data from an image in MATLAB and I would like to decompose it into a gaussian mixture. The counts and binLocations of the image are stored in 256x2 matrix 'X' and fitgmdist(X, 3) gives parameters for three gaussians. I would like to plot these gaussians and find their intersections.

I am also a bit confused why parameters such as 'mu' in my mixture model have two columns. Shouldn't each gaussian have one corresponding mean?

1 Answers1

6

I tried to solve the problem by using the pout.tif image:

enter image description here

Here are the plots:

enter image description here

Here is the code:

I = imread('pout.tif');
imshow(I);

[counts,binLocations] = imhist(I);

subplot(3, 1, 1);
stem(binLocations, counts, 'MarkerSize', 1 );
xlim([50 200]);

X = I(:);
options = statset('MaxIter', 300); % default value is 100. Sometimes too few to converge

gm = gmdistribution.fit(double(X),3, 'Options', options);

subplot(3, 1, 2);
plot(binLocations, pdf(gm,binLocations));
xlim([50 200]);

subplot(3, 1, 3);
for j=1:3
    line(binLocations,gm.PComponents(j)*normpdf(binLocations,gm.mu(j),sqrt(gm.Sigma(j))),'color','r');
end
xlim([50 200]);

UPDATE

Here are some explanations to the code:

The first plot shows the histogram: color codes and their numbers in the image. It is just to demonstrate the frequency of each color. Later on it can be seen that the pdf plots resemble the histogram profile - a good validation means.

The second plot shows the pdf for the color codes. The model describes the real distribution by means of an approximation as a sum of k=3 normal distributions:

enter image description here

Sometimes the algorithm cannot find an appropriate approximation and ends up with only 2 distributions. You need to restart the code again. The area under the curve is equal to 1. In order to hold this constraint true, the pdf-components are scaled with parameters p.

On the third plot the single distributions are depicted. They are continous functions of x. You can solve them symbolically for x and they will look like:

f1 = 0.0193*exp(-0.0123*(x - 97.6)^2)

f2 = 0.0295*exp(-0.0581*(x - 83.0)^2)

f3 = 0.0125*exp(-0.00219*(x - 131.0)^2)

In order to find their intersections you can construct a symbolic function, put the corresponding parameters for each distribution, write an equation, and then use solve function to solve the equation for x:

syms x m s p
f = symfun(p/(s*sqrt(2*pi))*exp(-((x-m)^2)/(2*(s^2))), [x, m, s, p]);

for j = 1:3
    ff(j) = f(x, gm.mu(j), sqrt(gm.Sigma(j)), gm.PComponents(j));
end

eqn = ff(1) == ff(2);
sols = solve(eqn, x);

display(vpa(sols, 3));

The above code finds intersections of the 1st and 2nd pdfs. The result: x=88.1, x = 70.0.

UPDATE

Using the simbolic equation above you can find more about your pdfs:

% At which value of x the 3rd pdf is equal to 0.01?
eqn = ff(3) == 0.01;
sols = solve(eqn, x);
display(vpa(sols, 3)); %returns 121.0 and 141.0

% What is the value of the 3rd pdf at x = 141?
sol = subs(ff(3), x, 141);
display(vpa(sol, 3)); %returns 0.0101
Anton
  • 4,544
  • 2
  • 25
  • 31
  • Hm, I did something similar. Two questions - why do you apply gmdistribution.fit to the whole image rather than just the count values? Also, why do you have to create a pdf of the GMM? – Priyansh Sharma Dec 29 '15 at 23:08
  • According to the documentation the function works with the data itself and not with its statistical properties, does not it? It computes 'Gaussian mixture model with k components for data in the n-by-d matrix X, where n is the number of observations and d is the dimension of the data'. In your case X is 256x2, so the function interprets the second dimension as another observation of the data. – Anton Dec 29 '15 at 23:18
  • Why do you initially plot a pdf of the gm and then later a normpdf? Also is there a way to make these distributions continuons rather than evaluated at discrete points? – Priyansh Sharma Dec 30 '15 at 18:33
  • I updated my answer. I showed also how to find the intersections. I hope it can help. – Anton Dec 30 '15 at 22:30
  • What do you mean? The intersections of each pair of distributions? – Anton Dec 30 '15 at 23:38
  • I added two short examples. Is it what you asked about? – Anton Dec 31 '15 at 06:38