import numpy as np
from scipy.spatial import distance

d1 = np.random.randint(0, 255, size=(50))*0.9
d2 = np.random.randint(0, 255, size=(50))*0.7

vi = np.linalg.inv(np.cov(d1,d2, rowvar=0))   
res = distance.mahalanobis(d1,d2,vi)

print res

ValueError: shapes (50,) and (2,2) not aligned: 50 (dim 0) != 2 (dim 0)

Roman
  • What would be the output array shape, i.e. shape of `res`? Also, can you hand calculate the expected output for a very small, let's say for `d1` and `d2` as `3` elements each case? – Divakar Oct 25 '15 at 19:16
  • @Divakar the `res` is single number – Roman Oct 25 '15 at 19:22
  • If I am not mistaken, `vi` should be an estimate of the precision matrix of all your observations. `np.cov(d1, d2)` is probably not what you want. – cel Oct 25 '15 at 19:30
  • @cel doc says its inverse of covariance matrix http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.mahalanobis.html#scipy.spatial.distance.mahalanobis – Roman Oct 25 '15 at 19:39
  • You may want to check wikipedia to understand what exactly is measured by this distance. – cel Oct 25 '15 at 19:40
  • It's the wrong matrix, check out this answer http://stackoverflow.com/a/15068615/4016674 You need something like `np.linalg.inv(np.cov(np.vstack((d1, d2)).T))` – hellpanderr Oct 25 '15 at 19:45
  • @hellpanderrr It worked, but in other cases, when I inverted the matrix, I got `LinAlgError: Singular matrix`. The matrix is not invertible. What should I do in this case? – Roman Oct 25 '15 at 19:53
  • I would argue if you cannot invert the matrix, there is something else wrong going on. If you insist on it being inverted, you can use pseudo-inversion (ie `np.linalg.pinv`) – Julien Oct 25 '15 at 19:55
  • @Julien that's great it's not giving error. – Roman Oct 25 '15 at 19:57
  • @Julien Overall the program did not fail, but the result is not correct. I did not know how to use `vi` correctly... – Roman Oct 25 '15 at 20:05
  • Just passing by... Hey, you should upgrade to Python 3, though that won't solve your problem. – Julien Palard Oct 25 '15 at 22:05
  • @JulienPalard Thanks, I'll try. But why is it different from py2.7? – Roman Oct 26 '15 at 00:07
  • @jean "Short version: Python 2.x is legacy, Python 3.x is the present and future of the language" https://wiki.python.org/moin/Python2orPython3 – Julien Palard Oct 26 '15 at 06:25

2 Answers


The Mahalanobis distance measures the distance between two D-dimensional vectors relative to a D x D covariance matrix, which in some sense "defines the space" in which the distance is calculated. The matrix encodes how the various combinations of coordinates should be weighted when computing the distance.
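
Written out, using u and v for the two vectors and Σ for the covariance matrix (my notation; the form matches the SciPy documentation linked in the comments, where the third argument `VI` is the inverse covariance):

```latex
d(u, v) = \sqrt{(u - v)^{\top} \, \Sigma^{-1} \, (u - v)},
\qquad u, v \in \mathbb{R}^{D}, \quad \Sigma \in \mathbb{R}^{D \times D}
```

For the product to be defined, Σ⁻¹ must have as many rows and columns as u and v have elements.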

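It seems that you've computed the 2x2 sample covariance for your points, treating `d1` and `d2` as two variables with 50 observations each. That is not the right type of covariance matrix to use in a Mahalanobis distance between two 50-dimensional vectors.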
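
A quick shape check makes the mismatch visible. This is only a sketch that reuses the arrays from the question; the variable name `cov` is mine:

```python
import numpy as np

d1 = np.random.randint(0, 255, size=50) * 0.9
d2 = np.random.randint(0, 255, size=50) * 0.7

# d1 and d2 are treated as two variables with 50 observations each,
# so the covariance is 2 x 2 while the vectors have 50 elements.
cov = np.cov(d1, d2, rowvar=0)
print(cov.shape)  # (2, 2)
print(d1.shape)   # (50,)
```

That 2 x 2 matrix cannot be multiplied with a 50-element difference vector, which is what the `ValueError` in the question reports.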

If you don't already have a well-justified 50x50 covariance matrix that defines your Mahalanobis metric, the Mahalanobis distance is probably not the right choice for your application. Without more detail it's hard to give a better recommendation.
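
For contrast, here is a minimal sketch of the situation the distance is designed for, assuming you have many observations of D-dimensional points from which a D x D covariance can be estimated. The data set `X`, its size, and the `mixing` matrix are invented purely for illustration:

```python
import numpy as np
from scipy.spatial import distance

rng = np.random.default_rng(0)

# Hypothetical data set: 200 observations of 3 correlated features.
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 1.0]])
X = rng.normal(size=(200, 3)) @ mixing

# Columns are the variables, so the covariance is D x D = 3 x 3.
cov = np.cov(X, rowvar=False)
vi = np.linalg.inv(cov)

# Distance between two 3-dimensional points under that covariance.
u, v = X[0], X[1]
res = distance.mahalanobis(u, v, vi)

# The same quantity written out by hand.
delta = u - v
manual = np.sqrt(delta @ vi @ delta)
print(res, manual)
```

Here the covariance comes from far more observations than dimensions, so it is invertible and its shape matches the vectors being compared.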

jakevdp

As mentioned in jakevdp's answer, your inverse covariance matrix must be D x D, where D is the number of elements in each of your vectors. So, your code should be:

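import numpy as np
from scipy.spatial import distance

d1 = np.random.randint(0, 255, size=50) * 0.9
d2 = np.random.randint(0, 255, size=50) * 0.7

# Stack the two vectors as columns: shape (50, 2), one row per element.
m = np.column_stack((d1, d2))
# np.cov treats each row as a variable, so v has shape (50, 50).
v = np.cov(m)
try:
    vi = np.linalg.inv(v)
except np.linalg.LinAlgError:
    # Fall back to the pseudo-inverse if v cannot be inverted.
    vi = np.linalg.pinv(v)

res = distance.mahalanobis(d1, d2, vi)

print(res)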
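
One caveat worth checking before relying on this number: with only two observations, the 50 x 50 sample covariance works out to (d1 - d2)(d1 - d2)ᵀ / 2, which has rank 1 and is therefore singular. If the pseudo-inverse is used, the distance comes out as sqrt(2) whatever the inputs are. The sketch below illustrates that; the seeded generator and the explicit `rcond` cutoff are my choices for reproducibility, not part of the code above:

```python
import numpy as np
from scipy.spatial import distance

rng = np.random.default_rng(0)
d1 = rng.integers(0, 255, size=50) * 0.9
d2 = rng.integers(0, 255, size=50) * 0.7

# 50 variables, 2 observations: shape (50, 50), rank 1 in exact arithmetic.
v = np.cov(np.column_stack((d1, d2)))

# Pseudo-inverse with an explicit cutoff (assumption made for this sketch)
# so that round-off noise in the zero singular values is discarded.
vi = np.linalg.pinv(v, rcond=1e-10)

res = distance.mahalanobis(d1, d2, vi)
print(res, np.sqrt(2))
```

This is consistent with the point that a covariance estimated from the two vectors themselves does not define a meaningful metric between them.
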
Ali Elbehery