5

I have a high-dimensional Gaussian with mean M and covariance matrix V. I would like to calculate the distance from a point p to M, taking V into consideration (I guess it's the distance of p from M in standard deviations?).

Phrased differently, I take the ellipse one sigma away from M, and would like to check whether p is inside that ellipse.

Elli Amir

3 Answers

4

If V is a valid covariance matrix of a Gaussian, then it is symmetric positive definite and therefore defines a valid scalar product. By the way, so does inv(V).

Therefore, assuming that M and p are column vectors, you could define distances as:

d1 = sqrt((M-p)'*V*(M-p));
d2 = sqrt((M-p)'*inv(V)*(M-p));

The Matlab way would be to rewrite d2 as follows (probably with some unnecessary parentheses):

d2 = sqrt((M-p)'*(V\(M-p)));

The nice thing is that when V is the identity matrix, d1 == d2 and both reduce to the classical Euclidean distance. Finding out whether you need d1 or d2 is left as an exercise (sorry, part of my job is teaching). Write out the multidimensional Gaussian formula and compare it to the 1D case, since the 1D case is just a particular case of the multidimensional one (or run a numerical experiment).
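One way to run the numerical experiment suggested above, as a minimal sketch: in 1D the covariance matrix is just the variance, so a point placed exactly one standard deviation from the mean should come out at distance 1 for whichever definition counts standard deviations. The values of M, sigma and p below are made up for illustration.

```matlab
% 1D sanity check (illustrative values, not from the question)
M     = 3;            % mean
sigma = 2;            % standard deviation
V     = sigma^2;      % in 1D the covariance matrix is the variance
p     = M + sigma;    % a point exactly one standard deviation from M

d1 = sqrt((M-p)' * V * (M-p));     % weights the difference by V
d2 = sqrt((M-p)' * (V \ (M-p)));   % weights the difference by inv(V)

fprintf('d1 = %g, d2 = %g\n', d1, d2);
```

Only one of the two values equals 1 here, which settles which definition measures the distance in standard deviations; the same definition then gives the "inside the one-sigma ellipsoid" test as a comparison against 1.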

NB: in very high-dimensional spaces, or when there are very many points to test, you might find a cleverer / faster way using the eigenvectors and eigenvalues of V (i.e. the principal axes of the ellipsoid and their corresponding variances).
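A possible sketch of the eigenvector idea for many points at once, assuming the points are stored as the columns of a matrix P (a layout chosen here for illustration): decompose V once, rotate the centred points onto the principal axes, and scale each coordinate by its variance. The values are made up, and bsxfun is used so the code does not rely on implicit expansion.

```matlab
% Illustrative data: 2D Gaussian and three test points stored as columns of P
M = [1; -1];
V = [0.9 0.4; 0.4 0.3];
P = [0.5 1 2; -1.5 -1 0];

[Q, D] = eig(V);          % columns of Q: principal axes; diag(D): variances along them
lambda = diag(D);

C = bsxfun(@minus, P, M); % centre every point on the mean
Z = Q' * C;               % coordinates of the centred points along the principal axes
dEig = sqrt(sum(bsxfun(@rdivide, Z.^2, lambda), 1));   % one distance per column of P

% Reference: the same quantity computed with the backslash operator
dRef = sqrt(sum(C .* (V \ C), 1));
```

The decomposition is computed once and reused for every point, so the per-point cost is a rotation and a scaling. dEig and dRef should agree up to rounding error.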

Hope this helps.

A.

Adrien
3

Consider computing the probability density of the normal distribution at the point:

M = [1 -1];             %# mean vector
V = [.9 .4; .4 .3];     %# covariance matrix
p = [0.5 -1.5];         %# 2d-point
prob = mvnpdf(p,M,V);   %# probability density at p, given mean M and covariance V

The function MVNPDF is provided by the Statistics Toolbox.
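If the toolbox is not available, the same density can be written out by hand from the standard multivariate normal formula, as in the sketch below. It reuses the example values above; the variable names delta, d2sq and pdfVal are introduced here for illustration. Since the density depends on p only through the quadratic form (p-M)*inv(V)*(p-M)', thresholding the density is equivalent to thresholding that distance.

```matlab
M = [1 -1];             % mean (row vector, as in the example above)
V = [.9 .4; .4 .3];     % covariance matrix
p = [0.5 -1.5];         % 2D point

k     = numel(M);                % dimension
delta = p - M;                   % difference as a row vector
d2sq  = delta * (V \ delta');    % squared distance weighted by inv(V)

% Multivariate normal density evaluated at p
pdfVal = exp(-0.5 * d2sq) / sqrt((2*pi)^k * det(V));

% p lies inside the one-sigma ellipsoid when the weighted distance is at most 1
insideOneSigma = sqrt(d2sq) <= 1;
```

For very high dimensions det(V) can underflow or overflow, so working with the log-density is usually safer; that refinement is left out to keep the sketch short.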

Amro
0

Maybe I'm totally off, but isn't this the same as just asking for each dimension: Am I inside the sigma?

PSEUDOCODE:

foreach(dimension d)
    (M(d) - sigma(d) < p(d) < M(d) + sigma(d)) ?

Because you want to know whether p is inside in every dimension of your Gaussian. So actually, this is just a spatial problem, and your Gaussian doesn't have anything to do with it (except for M and sigma, which are just distances).

In MATLAB you could try something like:

all(M - sigma < p & p < M + sigma)

The distance to M could simply be the Euclidean distance. I don't know offhand whether there is a dedicated function for it, but the norm of the difference should work:

norm(M - p)

Because M is just a point in space, and so is p. Just two vectors. And now the final step: you want to know the distance in units of sigma:

% form the difference vector and divide it element-wise by sigma
(M - p) ./ sigma

I think that will do the trick.

Marnix
  • This solution will not work, since it does not take into consideration the covariance (the ellipse is "slanted", in a way). Thanks though! – Elli Amir Dec 15 '10 at 21:51