I have data of dimension 50x100000 (100000 feature vectors, each of dimension 50).

I would like to fit a Gaussian mixture model to this data. I used the following code.

               obj = gmdistribution.fit(X',3);

What I need is that, when I give new data Y, I should be able to get the likelihood $p(Y|\theta)$, where $\theta$ are the Gaussian mixture model parameters.

I used the following code to get the probability values.

               P = pdf(obj,X');

But I am getting very low values; all are about 0. Why is this happening? How can I get appropriate probability values?

Amro
user570593
  • When you say that your data is dimension 50x100000, do you mean that you've got 100000 vectors of length 50, and that you're looking for a mixture of multivariate normal distributions, i.e. each distribution in the mixture is a multivariate normal distribution for a vector of length 50? – Stochastically Jun 14 '13 at 16:04

1 Answer

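In one dimension, the maximum value of the pdf of a Gaussian distribution with unit variance is 1/sqrt(2*PI), which is about 0.4. So in 50 dimensions, with unit variance in each dimension, the maximum value is 1/(sqrt(2*PI)^50), which is around 1E-20. Unless the fitted covariances are much narrower than that, the values of the pdf are all going to be of that order of magnitude, or smaller. Values this close to 0 are therefore expected and do not indicate an error in the fit.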
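
As a sketch of where this figure comes from, assume a single component that is a $d$-dimensional Gaussian with mean $\mu$ and covariance $\Sigma$ (these symbols are introduced here for illustration and do not appear in the question). The density peaks at the mean:

$$
\max_x \, \mathcal{N}(x \mid \mu, \Sigma) = \mathcal{N}(\mu \mid \mu, \Sigma) = (2\pi)^{-d/2} \, |\Sigma|^{-1/2}
$$

With $d = 50$ and $\Sigma = I$ (an assumption made only to get a number), this gives

$$
(2\pi)^{-25} \approx 1.1 \times 10^{-20}, \qquad \log\left((2\pi)^{-25}\right) = -25 \log(2\pi) \approx -45.95
$$

A mixture density is a weighted average of its component densities, so it cannot exceed the largest component peak. Because the raw values are this small, it is usually easier to compare them on a log scale.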

Stochastically
  • 7,616
  • 5
  • 30
  • 58
  • Thank you for the reply. What is PI here? I am planning to build a classifier that rejects outliers based on the probability values. I am learning this distribution from one class of data. In my case, how can I build a classifier based on the probability values? – user570593 Jun 14 '13 at 16:29
  • PI is 3.14159 etc. sqrt is the square-root function. I *think* that I've probably answered your original question, so if you agree, you might consider flagging this answer as useful and/or accepted. If you post another question related to how to build your classifier then I'll definitely have a look and post an answer to your new question if I can. FYI, I always take a look at all questions that have the `probability` tag. – Stochastically Jun 14 '13 at 16:33
  • Thank you for the reply. I have posted a new question at http://stackoverflow.com/questions/17113387/outlier-detection-based-on-gaussian-mixture-model looking for your reply. :-) – user570593 Jun 14 '13 at 16:41