Using distance matrix to find coordinate points of set of points

Question

Given a distance matrix and a set of points, how do you figure out the coordinates of these points?

Edit: This is on a plane.

This question was answered here but in trying different distance matrices, I really couldn't use this answer because the M matrix had negative values, as did my eigenvectors. So when you took the square root, the program (in R) outputs "NaN" for those associated entries. I'm guessing this will happen every time D(i,j)^2 is greater than D(1,j)^2 + D(i,1)^2.

For example, say I have a distance matrix:

0    73   102  496  432  184
73    0   303  392  436  233
102  303    0  366  207  353
496  392  366    0  172  103
432  436  207  172    0  352
184  233  353  103  352    0

Using the equation M(i,j) = (0.5)(D(1,j)^2+D(i,1)^2-D(i,j)^2), I get (which already has negative entries):

0      0.0      0.0      0.0      0.0      0.0
0   5329.0 -38038.0  48840.5    928.5  -7552.0
0 -38038.0  10404.0  61232.0  77089.5 -40174.5
0  48840.5  61232.0 246016.0 201528.0 134631.5  
0    928.5  77089.5 201528.0 186624.0  48288.0
0  -7552.0 -40174.5 134631.5  48288.0  33856.0

Then I get non - zero eigenvalues & eigenvectors:

477718.27  101845.63   16474.30  -13116.72 -100692.49


        [,1]       [,2]        [,3]        [,4]        [,5]
 0.00000000  0.0000000  0.00000000  0.00000000  0.00000000
-0.05928626  0.3205747  0.84148945  0.04869546 -0.42806691
-0.16650486 -0.5670946 -0.04507520 -0.58222690 -0.55647098
-0.73371713  0.2827320  0.07386302 -0.45957443  0.40627254
-0.59727407 -0.4623603  0.07806418  0.64968004 -0.03617241
-0.27144823  0.5309625 -0.52755471  0.15920983 -0.58372335

Since there are both negative eigenvalues and eigenvectors, when we compute sqrt(eigenvector(i)*eigenvalue(i)), we'll have negative values. Here is my final output:

[,1]     [,2]      [,3]     [,4]      [,5]
   0   0.0000   0.00000  0.00000   0.00000
 NaN 180.6907 117.74103      NaN 207.61291
 NaN      NaN       NaN 87.38939 236.71174
 NaN 169.6910  34.88326 77.64089       NaN
 NaN      NaN  35.86158      NaN  60.35139
 NaN 232.5429       NaN      NaN 242.43877

Is this the only clear way of computing the coordinate points without using angles? If it is, do we have to fix the distance matrix so D(i,j)^2 is not greater than D(1,j)^2 + D(i,1)^2.

Thanks.

See http://en.wikipedia.org/wiki/Distance_geometry and the references therein. — MvG, Aug 07 '13 at 14:45
Are you talking about points in the plane, or points in 3-space, or perhaps even something in a higher dimension? — MvG, Aug 07 '13 at 16:49
@MvG Perhaps I'm setting this up wrong? It's supposed to be on a plane. For example, the distance matrix is showing the distance from point 1 to point 1, point 1 to point 2, point 1 to point 3, etc. — astudent, Aug 07 '13 at 19:14

MvG · Accepted Answer · 2013-08-07T21:37:07.003

Your data is inconsistent

Your coordinates are not consistent with positions of points in ℝ⁴, let alone a space of lower dimension. You can tell that fact by computing the Menger determinant of your squared distance matrix:

D <- as.matrix(read.table(textConnection("\
0    73   102  496  432  184
73    0   303  392  436  233
102  303    0  366  207  353
496  392  366    0  172  103
432  436  207  172    0  352
184  233  353  103  352    0")))
n <- nrow(D)
det(rbind(cbind(D^2, 1), c(rep(1, n), 0)))
# Result: 3.38761e+25

If your coordinates really came from points in a space of dimension less than five, then this determinant would have to be zero. As it is not, your distances are inconsistent, or the points form a simplex in a space of sufficiently high dimension.

But no mater the dimension, your data is still inconsistent since it violates the triangle inequality in several cases:

a b c   ac   abc    ab    bc
1 2 4: 496 > 465 =  73 + 392
1 3 4: 496 > 468 = 102 + 366
1 3 5: 432 > 309 = 102 + 207
1 6 4: 496 > 287 = 184 + 103
2 1 3: 303 > 175 =  73 + 102
2 6 4: 392 > 336 = 233 + 103
3 1 6: 353 > 286 = 102 + 184
5 4 6: 352 > 275 = 172 + 103

Going from a to c directly can never take longer than going via b, but according to your data it does.

Simple planar approach

If you had data consistent with points in the plane (i.e. all Menger determinants for combinations of four points evaluate to zero), you could use the following to obtain coordinates:

distance2coordinates <- function(D) {
  n <- nrow(D)
  maxDist <- which.max(D)
  p1 <- ((maxDist - 1) %% n) + 1
  p2 <- ((maxDist - 1) %/% n) + 1
  x2 <- D[p1, p2]
  r1sq <- D[p1,]^2
  r2sq <- D[p2,]^2
  x <- (r1sq - r2sq + x2^2)/(2*x2)
  y <- sqrt(r1sq - x^2)
  p3 <- which.max(y)
  x3 <- x[p3]
  y3 <- y[p3]
  plus <- abs(D[p3,]^2 - (x3 - x)^2 - (y3 - y)^2)
  minus <- abs(D[p3,]^2 - (x3 - x)^2 - (y3 + y)^2)
  y[minus < plus] <- -y[minus < plus]
  coords <- data.frame(x = x, y = y)
  return(coords)
}

The idea is that you choose two points with maximal distance as starting points. You place on in the origin and the other on the positive x axis. Then you can compute all other x coordinates from this, as the intersection of two circles, following the equations

I:     x²       + y² = r₁²
II:   (x - x₂)² + y² = r₂²
I-II:  2*x*x₂ = r₁² - r₂² + x₂²

Given these x coordinates, you can obtain y coordinates as well, up to sign. You then choose a third point, sufficiently far away from either of these two starting points, to decide on the sign.

This approach makes no attempt at all to handle imprecise input. It assumes exact data, and will only use part of the distance matrix to find the points. It will not find the point set most closely matching all of the input data.

On your data, this will fail, since some arguments to the square root will be negative. This means that the two circles involved don't intersect at all, hence the triangle inequality is violated.

If it is, do we have to fix the distance matrix so D(i,j)^2 is not greater than D(1,j)^2 + D(i,1)^2.

D(i,j) ≤ D(i,k) + D(k,j) would help, i.e. for all triples and without squares. This would ensure that the triangle inequality holds everywhere. The result still need not be planar; for that you'd have to fix all those Menger determinants.

@ MvG I'm not understanding exactly what you mean by data that's consistent with points on a plane.The distance resembles the actual distance (in real world) of four locations, so say I have town 1 and town 2, then town 1 is 72 miles away from town 2. — astudent, Aug 07 '13 at 19:18
@astudent: That cannot be. At least not if by “distance” you mean Euclidean straight line distance on the plane. I now included a more detailed exposure indicating the violated triangle inequalities, which should be easier to understand than the violated Menger determinants. Either one says your data does not come from points in the plane. — MvG, Aug 07 '13 at 21:39
I guess I understand what you're saying, and perhaps I just keep thinking of things in terms of the real world, but is not possible in actual geographical terms that any location is at any distance from another? And wouldn't the numbers be affected depending which location within the city you use as a measuring basis, for example if you measure from the center of each location? Anyway, thank you! I'll look at the distance matrix again and see how to fix it. This has been helpful. — astudent, Aug 08 '13 at 00:05
Forget it, I think I understand. The issue will still be that the triangle inequality can't be violated. By the way, why are you looking at R^4? — astudent, Aug 08 '13 at 00:13
@astudent: Suppose you have 4 points, and all distances are 1. Then you know they form a regular tetrahedron in ℝ³, but won't fit into ℝ². In a similar way, your 6 points might have fit in ℝ⁵ (if all triangle inequalities were met), but since that determinant was nonzero, they could not have fit into ℝ⁴ or lower. For n points in the plane, you can choose 2n coordinates. A rigid motion won't affect distances, so there are only 2n-3 real degrees of freedom in the distance setup. Half the distance matrix contains n(n-1)/2 values, so there is redundancy and therefore room for inconsistency. — MvG, Aug 08 '13 at 07:15
Okay, I see. That makes sense. I tried using another matrix (using distances provided by Google) and checked that the triangle inequality holds and it does but I tried what you did at first, checking the determinant using n <- nrow(D) det(rbind(cbind(D^2, 1), c(rep(1, n), 0))) and I'm still not getting zero for the determinant, so I think I just need to read more about the Menger determinant (I've never worked with it). Thank you for all you help. — astudent, Aug 08 '13 at 07:40
However, I did use the simple planar approach with this set and the coordinates seem to be correct. I plotted the points and it resembles where NYC, LA, Oklahoma City and Miami are located. I guess I just want to understand more the math behind everything. But once again, thank you. — astudent, Aug 08 '13 at 07:56
@astudent: If your locations are spread over large parts of the US, then the planar approximation no longer holds: earth is *not* a plane, and that effect becomes noticable at these distances, resulting in a non-zero Menger determinant for 4 points. If it also violates Menger determinants for more points, indicating an inconsistency even for points in space, then this is most likely due to the fact that your distances are geodesic distances along the surface of the earth, not Euclidean distances between arbitrary points in space. Not sure how far distance geometry on a sphere has been studied. — MvG, Aug 08 '13 at 08:04

user2974951 · Answer 2 · 2017-09-29T06:56:37.253

This is a simple python function to calculate what you need, solving hyperspheres.

import sympy
import numpy as np
def give_coords(distances):
    """give coordinates of points for which distances given

    coordinates are given relatively. 1st point on origin, 2nd on x-axis, 3rd 
    x-y plane and so on. Maximum n-1 dimentions for which n is the number
    of points

     Args:
        distanes (list): is a n x n, 2d array where distances[i][j] gives the distance 
            from i to j assumed distances[i][j] == distances[j][i]

     Returns:
        numpy.ndarray: cordinates in list form n dim

     Examples:
        >>> a = sympy.sqrt(2)
        >>> distances = [[0,1,1,1,1,1],
                         [1,0,a,a,a,a],
                         [1,a,0,a,a,a],
                         [1,a,a,0,a,a],
                         [1,a,a,a,0,a],
                         [1,a,a,a,a,0]]
        >>> give_coords(distances)
        array([[0, 0, 0, 0, 0],
               [1, 0, 0, 0, 0],
               [0, 1, 0, 0, 0],
               [0, 0, 1, 0, 0],
               [0, 0, 0, 1, 0],
               [0, 0, 0, 0, 1]], dtype=object)

        >>> give_coords([[0, 3, 4], [3, 0, 5], [4, 5, 0]])
        array([[0, 0],
        [3, 0],
        [0, 4]], dtype=object)        

    """
    distances = np.array(distances)

    n = len(distances)
    X = sympy.symarray('x', (n, n - 1))

    for row in range(n):
        X[row, row:] = [0] * (n - 1 - row)

    for point2 in range(1, n):

        expressions = []

        for point1 in range(point2):
            expression = np.sum((X[point1] - X[point2]) ** 2) 
            expression -= distances[point1,point2] ** 2
            expressions.append(expression)

        X[point2,:point2] = sympy.solve(expressions, list(X[point2,:point2]))[1]

    return X

Ian Quinn · Answer 3 · 2016-07-29T00:18:08.760

This is Solvable

If you would like to see cartesian-type coordinates which satisfy the distance matrix you provided in your question, then please view the following image.
distances matrix and coordinates Your input matrix gives the distances between 6 nodes which we shall call a, b, c, d, e, and f. There are a total of 5 dimensions required to be able to assign coordinates to all six nodes which satisfy your distance matrix. Two of these dimensions are imaginary valued -- which is a consequence of breaking the triangle rule. The results were arrived at by using the law of cosines and a bit of number crunching.

a (0, 0, 0, 0, 0)
b (73, 0, 0, 0, 0)
c (-521.07, 510.99i, 0, 0, 0)
d (669.05, -802.08i, 664.62, 0, 0)
e (12.72, -163.83i, 488.13, 158.01i, 0)
f (-103.45, 184.11i, 84.52, 138.06i, 262.62)

Using distance matrix to find coordinate points of set of points

3 Answers3

Your data is inconsistent

Simple planar approach

Linked