In the EM step of GMM, I call a function gaussianND as:

pdf(:, j) = gaussianND(unseen_data, mu(j, :), sigma{j});

which evaluates the Gaussian density at every data point for cluster 'j'. I have 150 data points and 10 clusters.

I get the warning "Warning: Matrix is singular, close to singular or badly scaled. Results may be inaccurate. RCOND = NaN." at the following line of the gaussianND function:

pdf = 1 / sqrt((2*pi)^n * det(Sigma)) * exp(-1/2 * sum((meanDiff * inv(Sigma) .* meanDiff), 2));

This line computes the multivariate Gaussian density. After a single EM iteration I get cluster probabilities (the probability that each data point belongs to each cluster) that make sense. With more than one iteration, however, all of my cluster probabilities become 'NaN' and the warning above appears.
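One plausible reading of the symptom, offered as an assumption rather than a diagnosis: if det(Sigma) underflows to 0 or inv(Sigma) is computed on a singular matrix, the expression above can produce Inf or NaN. Those values then flow into the next M-step, so mu and Sigma themselves become NaN, which would match "RCOND = NaN". The sketch below evaluates the same density in the log domain with a Cholesky factor instead of det and inv. The sample values of X, mu and Sigma, the ridge lambda and the variable names are illustrative and not part of the original gaussianND.

```matlab
% Sketch only: X, mu, Sigma and lambda are illustrative stand-ins, not the asker's data.
X      = [0 0; 1 2; -1 0.5];     % m-by-n data, one point per row
mu     = [0 0];                  % 1-by-n cluster mean
Sigma  = [1 0.2; 0.2 2];         % n-by-n cluster covariance
lambda = 1e-6;                   % assumed small ridge; tune for your data

n = size(X, 2);
meanDiff = bsxfun(@minus, X, mu);        % subtract the mean from every row
SigmaReg = Sigma + lambda * eye(n);      % ridge keeps the matrix positive definite

[R, p] = chol(SigmaReg);                 % SigmaReg = R' * R when p == 0
if p ~= 0
    error('Sigma is not positive definite even after adding lambda.');
end

z      = meanDiff / R;                   % solves z * R = meanDiff, avoids inv()
logDet = 2 * sum(log(diag(R)));          % log(det(SigmaReg)) without under/overflow
logpdf = -0.5 * (n * log(2*pi) + logDet + sum(z.^2, 2));   % m-by-1 log densities
```

Here exp(logpdf) reproduces the original formula when Sigma is well conditioned. In high dimensions it is usually safer to keep the values as logs and normalise the cluster probabilities in the log domain as well.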

Can somebody explain why this happens and suggest a solution, please?

Yeshi
  • It seems like an empty (or nearly empty) cluster forms in one of your iterations, so MATLAB is unable to estimate the covariance matrix for that cluster. If you only have 150 points, try using fewer than 10 clusters. – Shai Jan 19 '16 at 06:59
  • @Shai Is there any other way? I need my data in exactly 10 clusters. – Yeshi Jan 19 '16 at 07:40
  • Can you add more samples? – Shai Jan 19 '16 at 07:47
  • @Shai Even if I do reduce my clusters to 2, I get the NaN values in two iterations. – Yeshi Jan 19 '16 at 10:11
  • What are the dimensions of your points? How well scattered are they? – Shai Jan 19 '16 at 10:16
  • 150 data points of 73 dimensions. They are facial attributes, 15 images per person, so 10 clusters is a must. I think the points within each cluster should be considerably close to each other. – Yeshi Jan 19 '16 at 10:52
  • How come you have 73 dimensions? If you are trying to locate 2 eyes + nose + 2 mouth corners, you have a total of 5 x values and 5 y values. Are these your 10 clusters? What is the `size` of `unseen_data`? What are `size(meanDiff)` and `size(Sigma)`? – Shai Jan 19 '16 at 10:56
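The numbers in the comments above (73 dimensions, about 15 images per person) hint at a rank problem: a sample covariance estimated from m points has rank at most m - 1, so a 73-by-73 covariance supported by roughly 15 points cannot be inverted. The sketch below illustrates this with random data standing in for one person's feature vectors. The variable names, the ridge value lambda and the diagonal-covariance alternative are assumptions for illustration, not something taken from the question.

```matlab
% Illustrative only: random data standing in for one person's 15 feature vectors.
rng(0);                          % fixed seed so the sketch is repeatable
m = 15;                          % points in the cluster
n = 73;                          % feature dimension
Xk = randn(m, n);

Sk = cov(Xk);                    % n-by-n sample covariance
r  = rank(Sk);                   % at most m - 1, far below n, so Sk is singular

% Option 1 (assumed ridge value): add a small multiple of the identity.
lambda = 1e-3;
SkReg  = Sk + lambda * eye(n);
[~, p] = chol(SkReg);            % p == 0 means SkReg is positive definite

% Option 2: keep only the per-feature variances (diagonal covariance).
SkDiag = diag(var(Xk, 0, 1));    % invertible as long as no feature is constant
```

Either option would be applied to each sigma{j} after the M-step. Reducing the dimensionality first (for example with PCA) is another common route when there are far fewer points than features.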

0 Answers