I have a dataset of coordinates that are merged by time into one dataframe, with the individual IDs in the header. For example:

> Date_time<-c("2015/03/04 01:00:00","2015/03/04 02:00:00","2015/03/04 03:00:00","2015/03/04 04:00:00")
> lat.1<-c(63.81310,63.83336,63.83250,63.82237)
> long.1<-c(-149.1176,-149.0193,-149.0249,-149.0408)
> lat.2<-c(63.85893 ,63.85885,63.86108,63.86357)
> long.2<-c(-151.1336,-151.1336,-151.1236,-151.1238)
> lat.3<-c(63.87627,63.87670, 63.85044,63.85052)
> long.3<-c(-149.5029,-149.5021,-149.5199,-149.5199)
> 
> data<-data.frame(Date_time,lat.1,long.1,lat.2,long.2,lat.3,long.3)
> data
            Date_time    lat.1    long.1    lat.2    long.2    lat.3    long.3
1 2015/03/04 01:00:00 63.81310 -149.1176 63.85893 -151.1336 63.87627 -149.5029
2 2015/03/04 02:00:00 63.83336 -149.0193 63.85885 -151.1336 63.87670 -149.5021
3 2015/03/04 03:00:00 63.83250 -149.0249 63.86108 -151.1236 63.85044 -149.5199
4 2015/03/04 04:00:00 63.82237 -149.0408 63.86357 -151.1238 63.85052 -149.5199

I want to calculate the distance between each of the individuals, so between 1 and 2, 1 and 3, and 2 and 3. My dataframe has many more individuals than this so I am hoping to apply a loop function.

I can do them individually using

> library(geosphere)
> data$distbetween12<-distHaversine(cbind(data$long.1,data$lat.1), cbind(data$long.2,data$lat.2))
> data$distbetween12
[1]  99083.48 103778.13 103020.61 102317.93

But can I calculate all the pairwise distances without typing out every pair combination?

Thank you!

CED

1 Answer

Here's a solution that uses the combn function to generate every pair of individuals. If you have more than 3 individuals (that is, more than 3 pairs of lat/long columns), change the first argument of combn to the number of individuals.

Note that this solution requires your columns to strictly follow the naming pattern lat.1, long.1, lat.2, long.2, and so on.

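 # Each column of combos is one pair of individual IDs: (1,2), (1,3), (2,3)
 combos <- combn(3, 2)

 # One column of Haversine distances (in metres) per pair
 dists <- apply(combos, 2, function(x) {
   lats <- paste0("lat.", x)
   lons <- paste0("long.", x)
   geosphere::distHaversine(cbind(data[[lons[1]]], data[[lats[1]]]),
                            cbind(data[[lons[2]]], data[[lats[2]]]))
 })
 colnames(dists) <- apply(combos, 2, paste, collapse = " v ")

 # Append the distance columns to the original data
 cbind(data, as.data.frame(dists))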
 
#>             Date_time    lat.1    long.1    lat.2    long.2    lat.3    long.3
#> 1 2015/03/04 01:00:00 63.81310 -149.1176 63.85893 -151.1336 63.87627 -149.5029
#> 2 2015/03/04 02:00:00 63.83336 -149.0193 63.85885 -151.1336 63.87670 -149.5021
#> 3 2015/03/04 03:00:00 63.83250 -149.0249 63.86108 -151.1236 63.85044 -149.5199
#> 4 2015/03/04 04:00:00 63.82237 -149.0408 63.86357 -151.1238 63.85052 -149.5199
#>       1 v 2    1 v 3    2 v 3
#> 1  99083.48 20172.13 79974.87
#> 2 103778.13 24168.80 80014.97
#> 3 103020.61 24374.46 78669.90
#> 4 102317.93 23724.27 78680.61
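
If you would rather not hard-code the number of individuals, one possible variation is to read the IDs off the column names and pass them to combn directly. This is only a sketch under the same naming assumption (lat.<id> and long.<id> for every individual); the names ids, pairs, dists and result are illustrative and not part of geosphere or the code above.

```r
library(geosphere)

# Example data from the question
data <- data.frame(
  Date_time = c("2015/03/04 01:00:00", "2015/03/04 02:00:00",
                "2015/03/04 03:00:00", "2015/03/04 04:00:00"),
  lat.1  = c(63.81310, 63.83336, 63.83250, 63.82237),
  long.1 = c(-149.1176, -149.0193, -149.0249, -149.0408),
  lat.2  = c(63.85893, 63.85885, 63.86108, 63.86357),
  long.2 = c(-151.1336, -151.1336, -151.1236, -151.1238),
  lat.3  = c(63.87627, 63.87670, 63.85044, 63.85052),
  long.3 = c(-149.5029, -149.5021, -149.5199, -149.5199)
)

# Assumption: every individual has a "lat.<id>" and a matching "long.<id>" column.
# Strip the "lat." prefix to recover the IDs as character strings.
ids <- sub("^lat\\.", "", grep("^lat\\.", names(data), value = TRUE))

# All unordered pairs of IDs, one pair per column
pairs <- combn(ids, 2)

# Haversine distance (metres) for each pair, one column per pair
dists <- apply(pairs, 2, function(p) {
  distHaversine(
    cbind(data[[paste0("long.", p[1])]], data[[paste0("lat.", p[1])]]),
    cbind(data[[paste0("long.", p[2])]], data[[paste0("lat.", p[2])]])
  )
})
colnames(dists) <- apply(pairs, 2, paste, collapse = " v ")

result <- cbind(data, as.data.frame(dists))
```

Because the IDs are taken from the names, this should also cope with non-consecutive or non-numeric IDs (for example lat.A, long.A), as long as each lat column has a matching long column.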

Allan Cameron