1

I am trying to compute some distances for each combination of lat/lon values I have.

The first data frame looks like:

     lon   lat NOMVIAL               
   <dbl> <dbl> <chr>                 
 1 -99.1  19.5 Tepozanes             
 2 -99.0  19.3 Bartolomé Díaz de León
 3 -99.2  19.3 Renato Leduc          
 4 -99.2  19.2 Cuautlalpan

The second data frame looks like:

    CVEGEO mean_lat mean_lon
     <int>    <dbl>    <dbl>
1 90130143     19.2    -99.1
2 90130234     19.2    -99.0
3 90090300     19.2    -99.0

So I want to take each combination in df2 and compute the distances for each row in df1 and storing the results as a matrix. I can compute the distances for a single lat/lon value using:

df1 %>% 
  add_column(
    M_lat = -99.183203,
    M_long = 19.506582
) %>% 
  mutate(
    Distance = geosphere::distHaversine(cbind(lon, lat), cbind(M_lat, M_long))
  )

However, I would like to use map and store the results as a matrix, as below.

Expected output:

                 90020001 90030001 90040001 90040010 90040020

    
Tepozanes            999      111      ...      ...      ...
Renato Leduc
Samahil
...
Primera              ...      ...      ...      ...      ...

Where the column names come from the column GVEGEO column in the df2 and the row names come from the column NOMVIAL in the df1 data frame. The values being the distances computed.

Data:

df1 <- structure(list(lon = c(-99.12587729, -99.03630014, -99.16578649, 
-99.18373215, -99.21312146, -99.29082258, -99.19958018, -99.05745354, 
-99.09046923, -99.04686154), lat = c(19.543991, 19.2921902, 19.29272965, 
19.2346386, 19.29264198, 19.32628302, 19.29913009, 19.38650317, 
19.47120537, 19.31618134), NOMVIAL = c("Tepozanes", "Bartolomé Díaz de León", 
"Renato Leduc", "Cuautlalpan", "Samahil", "Ninguno", "Monte de Sueve", 
"Ninguno", "Rinoceronte", "Primera")), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))


df2 <- structure(list(CVEGEO = c(90130143L, 90130234L, 90090300L, 90130284L, 
90130290L), mean_lat = c(19.2256141377143, 19.2447500775758, 
19.209320585524, 19.2219817711111, 19.2405991752941), mean_lon = c(-99.143825052, 
-99.0409973439394, -98.9713545799563, -99.1172106433333, -99.1347260164706
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))

EDIT:

To clarify, hopefully, a little more clearly.

I would like to take each row of df2 and use this to compute the distances of all rows in df1.

Take row 1 in df2

 CVEGEO       mean_lat      mean_lon
90130143      19.22561      -99.14383

Take these values and compute the distance for all of the rows in df1 - so for each of the 10 rows in df1 I will have a distance computed based on this row in df2. Then move to row 2 of df2 and do the same again...

So this matrix will have dimensions 5 columns and 10 rows.

user113156
  • 6,761
  • 5
  • 35
  • 81
  • I don't understand how these 2 data frames are connected. You would please explain more. And also when you iterate over every row of `df1` what values do you choose from the other data frame. – Anoushiravan R Jun 21 '21 at 19:57
  • 1
    I added an edit to explain a little more the details. – user113156 Jun 21 '21 at 20:04

1 Answers1

1

You can use the following solution. You have to swap mean_lon and mean_lat columns in your second data frame as you will get an error while lon and lat are exchanged.

library(dplyr)
library(purrr)
library(tidyr)
library(tibble)

map2(df4$lon, df4$lat, ~ df5 %>% 
       rowwise() %>%
       mutate(output = geosphere::distHaversine(c(.x, .y), c_across(mean_lon:mean_lat)))) %>%
  set_names(df4$NOMVIAL) %>%
  map(~ .x %>% 
        select(CVEGEO, output) %>% 
        pivot_wider(names_from = CVEGEO, values_from = output)) %>%
  bind_rows() %>%
  rownames_to_column() %>%
  mutate(rowname = df4$NOMVIAL)

# A tibble: 10 x 6
   rowname                `90130143` `90130234` `90090300` `90130284` `90130290`
   <chr>                       <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
 1 Tepozanes                  35492.     34483.     40636.     35857.     33786.
 2 Bartolomé Díaz de León     13513.      5304.     11476.     11549.     11831.
 3 Renato Leduc                7820.     14159.     22444.      9385.      6658.
 4 Cuautlalpan                 4313.     15044.     22501.      7133.      5193.
 5 Samahil                    10426.     18857.     27048.     12785.     10071.
 6 Ninguno                    19083.     27775.     36007.     21625.     18974.
 7 Monte de Sueve             10065.     17730.     25985.     12194.      9429.
 8 Ninguno                    20078.     15874.     21699.     19361.     18158.
 9 Rinoceronte                27908.     25739.     31724.     27885.     26088.
10 Primera                    14334.      7976.     14299.     12830.     12491.
Anoushiravan R
  • 21,622
  • 3
  • 18
  • 41