10

I am very lost in Euclidean distance calculation. I have found functions dist2{SpatialTools} or rdist{fields} to do this, but they doesn´t work as expected.

I suppose that one point has two coordinates in carthesian system, so [x,y]. To measure distance between 2 points (defined by row), I need 4 coordinates for 2 points, so point A: [x1,y1] point B: [x2,y2]

Points coordinations:

Points position

A[0,1]
B[0,0] 
C[1,1]
D[1,1]

I have two matrices: x1(A and C are there, defined by rows) and x2 (contain B and D). Written in matrix:

library("SpatialTools")
x1<-matrix(c(0,1,1,1), nrow = 2, ncol=2, byrow=TRUE)
x2<-matrix(c(0,0,1,1), nrow = 2, ncol=2, byrow=TRUE)

so I obtain

> x1
     [,1] [,2]
[1,]    0    1    #(as xy coordinates of A point)
[2,]    1    1    #(same for C point)

> x2
     [,1] [,2]
[1,]    0    0    #(same for B point)
[2,]    1    1    #(same for D point)

To calculate euclidean distance between

A <-> B  # same as x1[1,] <-> x2[1,]
C <-> D  # same as x1[2,] <-> x2[2,]

I assume to obtain EuclidDist:

> x1                           x2                         EuclidDist
     [,1] [,2]                      [,1] [,2]
[1,]    0    1    #A         [1,]    0    0    #B             1
[2,]    1    1    #B         [2,]    1    1    #D             0

I would like just to obtain vector of distances between two points identified by [x,y] coordinates, however, using dist2 I obtain a matrix:

> dist2(x1,x2)
         [,1] [,2]
[1,] 1.000000    1
[2,] 1.414214    0

My question is, which numbers describe the real Euclidean distance between A-B and C-D from this matrix? Am I misunderstanding something? Thank you very much for every advice or any explanation.

maycca
  • 3,848
  • 5
  • 36
  • 67

4 Answers4

17

If you just want a vector, something like this will work for you.

Try something like this:

euc.dist <- function(x1, x2) sqrt(sum((x1 - x2) ^ 2))

library(foreach)
foreach(i = 1:nrow(x1), .combine = c ) %do% euc.dist(x1[i,],x2[i,])

This will work for any dimensions.

If you don't want to use foreach, you can use a simple loop:

dist <- NULL
for(i in 1:nrow(x1)) dist[i] <- euc.dist(x1[i,],x2[i,])
dist

Although, I would recommend foreach (because it's very easy to for various tasks like this). Read more about it in the documentation of the package.

Shambho
  • 3,250
  • 1
  • 24
  • 37
  • What if there are two matrices which are of unequal lengths?? `x1<-matrix(c(0,1,1,1,2,1), nrow = 3, ncol=2, byrow=TRUE) x2<-matrix(c(0,0,1,1), nrow = 2, ncol=2, byrow=TRUE)` – Manoj Kumar Oct 13 '16 at 17:16
0

The diagonal is what you're looking for. The output matrix of dist2 shows the distance between all points. The row number in the output corresponds to the row in the first input, and column of the output corresponds to the row in the second input. Here's a diagram, hope it makes sense (this is the kind of thing I wish Stack Overflow supported MathJax for):

dist2( A_x A_y     C_x C_y      ( AC  AD
       B_x B_y  ,  D_x D_y )  =   BC  BD ) 

dist2(   x1     ,     x2   )  =   result

In your case, you want the distance from the first point of x1 to the first point of x2, then the second point of x1 to the second point of x2, hence the diagonal.

If you have a lot of data, and you only care about the corresponding pairs, you'll be much better off calculating this directly:

> x1 <- matrix(c(0, 1, 1, 1), ncol = 2, byrow = T)
> x2 <- matrix(c(0, 0, 1, 1), ncol = 2, byrow = T)
> sqrt(rowSums((x1 - x2)^2))
[1] 1 0

If you've got a whole lot of data (millions of points), it might be worth using foreach like @Shambho suggests.

Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
0
library(rgdal)

library(sp)

##**COORDINATES** DATAFRAME THAT CONTENT THE LATITUDE (LAT) AND LONGITUDE 
##(LON) IN THE COORDINATE REFERENT SYSTEM (CRS) WGS84.

coordinates(COORDINATES) <- ~ LON + LAT

proj4string(COORDINATES) <- CRS("+proj=longlat +datum=WGS84") #ASSIGN THE CRS

Zone <- input$Zone   #UTM ZONE FOR YOUR COUNTRY

COORDINATES <- spTransform(COORDINATES, CRS(paste("+proj=utm", " +zone=", 
                           Zone, " +ellps=WGS84", " +datum=WGS84", " 
                           +units=m", sep="")))  #REPROJECT THE CRS
COORDINATES <- as.data.frame(COORDINATES)
X <- COORDINATES$LON  #EXTRACT THE LOGITUDE VECTOR
Y <- COORDINATES$LAT  #EXTRACT THE LATITUDE VECTOR
MX1 <- X %*% t(X) #CREATE A MATRIX FOR LONGITUDE VECTOR
MX2 <- matrix(rep(t(X),nrow(COORDINATES)), ncol = nrow(COORDINATES), 
              nrow = nrow(COORDINATES)) #CREATE A MATRIX FOR REPEAT LONGITUDE VECTOR
MX <- MX1/MX2 #DEFENITIVE MATRIX FOR LONGITUDE VECTORS
MX <- abs((MX-MX2)**2) #SQUARE SUM OF LONGITUDE VECTORS
colnames(MX)<- paste(COORDINATES$STATION) #ASSIGN COLNAMES
rownames(MX)<- paste(COORDINATES$STATION) #ASSIGN ROWNAMES
MY1 <- Y %*% t(Y) #CREATE A MATRIX FOR LATITUDE VECTOR
MY2 <- matrix(rep(t(Y), nrow(COORDINATES)), ncol = nrow(COORDINATES), 
              nrow = nrow(COORDINATES)) #CREATE A MATRIX FOR REPEAT LATITUDE VECTOR
MY <- MY1/MY2 #DEFENITIVE MATRIX FOR LATITUDE VECTORS
MY <- abs((MY-MY2)*2) #SQUARE SUM OF LONGITUDE VECTORS
colnames(MY)<- paste(COORDINATES$STATION) #ASSIGN COLNAMES
rownames(MY)<- paste(COORDINATES$STATION) #ASSIGN ROWNAMES
EUCLIDEAND <- round((sqrt(MX+MY)/1000), digits = 0) #EUCLIDEAN DISTANCE FOR THESE COORDINATES
EUCLIDEAND <- as.data.frame(EUCLIDEAND)
Jan Boyer
  • 1,540
  • 2
  • 14
  • 22
-1

You can always just apply the true equation (written for the sqldf package, but it can easily be converted):

sum(SQRT(power(a.LONG-b.lon,2)+power(a.LAT-b.lat,2))) AS DISTANCE
ChrisE
  • 1