0

I have a large matrix of distances between two sets of zip codes (computed with the geosphere package). I would like to run a function that finds all zip code pairings that are <= x distance away from each other and returns them as a list. The data looks like this:

       91423  92231  94321
 90034  3     4.5    2.25
 93201  3.75  2.5    1.5
 94501  2     6      0.5

So if I ran the function to extract all zip code pairings that are <2 miles away I would end up with these zip codes:

94321
94321
93201
94501

The goal is basically to identify all US zip codes that are adjacent to the zip codes in a list I have. If there is a better way to do this, I am open to suggestions.

brandonps
  • Your output doesn't make sense because it doesn't have pairings. `94321` is < 2 miles away from what? – Gregor Thomas Dec 29 '17 at 20:24
  • Also, shouldn't your data be an upper- or lower-triangular matrix? – Gregor Thomas Dec 29 '17 at 20:26
  • Apologies for the poor wording; the "pairings" are (94321, 93201) and (94321, 94501). The values are the miles apart. So the final list would have 94321 twice, but I just removed the duplicate. – brandonps Dec 29 '17 at 20:30
  • As for the matrix, I was just using the `distm` function from the `geosphere` package, which outputs a "Distance matrix of a set of points, or between two sets of points" – brandonps Dec 29 '17 at 20:35

3 Answers

1

Perhaps something like the following. It will be slow, but it should work.

# data is the distance matrix; distance is the cutoff (e.g. 2)
hold.zips <- NULL
for (i in seq_len(nrow(data))) {
    for (j in seq_len(ncol(data))) {
        if (data[i, j] < distance) {
            # i indexes rows and j indexes columns, so the pair is
            # (row zip, column zip); rbind() onto NULL starts the matrix
            hold.zips <- rbind(hold.zips,
                               c(rownames(data)[i], colnames(data)[j]))
        }
    }
}
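
If the loop turns out to be too slow on a large matrix, the same pairs can probably be pulled out in one vectorized step. The following is only a sketch in base R, assuming the matrix has zip codes as row and column names; `dat`, `idx` and `pairs` are illustrative names, and the values are the example from the question.

```r
# Example matrix from the question (rows and columns are zip codes)
dat <- matrix(c(3, 3.75, 2, 4.5, 2.5, 6, 2.25, 1.5, 0.5), nrow = 3,
              dimnames = list(c("90034", "93201", "94501"),
                              c("91423", "92231", "94321")))

# which(..., arr.ind = TRUE) returns one (row, col) index pair per match
idx <- which(dat < 2, arr.ind = TRUE)

# Translate the indices back to zip codes and keep the distance
pairs <- data.frame(from     = rownames(dat)[idx[, "row"]],
                    to       = colnames(dat)[idx[, "col"]],
                    distance = dat[idx],
                    stringsAsFactors = FALSE)
print(pairs)
```

Each row of `pairs` is one pairing under the cutoff, so `unique(c(pairs$from, pairs$to))` would give the flat, de-duplicated list of zip codes.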
1

This should work. Gives a nice list as output (calling your data x):

rn = rownames(x)
apply(x, 2, function(z) rn[z < 2])
# $`91423`
# character(0)
# 
# $`92231`
# character(0)
# 
# $`94321`
# [1] "93201" "94501"
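
One caveat worth knowing: `apply` only returns a list when the columns yield different numbers of matches. If every column happens to return the same number, it simplifies the result to a vector or matrix. A sketch that always returns a list, assuming the same `x` as above (the name `near` is just illustrative):

```r
# Example matrix from the question, called x as in the answer above
x <- matrix(c(3, 3.75, 2, 4.5, 2.5, 6, 2.25, 1.5, 0.5), nrow = 3,
            dimnames = list(c("90034", "93201", "94501"),
                            c("91423", "92231", "94321")))
rn <- rownames(x)

# simplify = FALSE keeps the result a list no matter how many matches
# each column has; the list is named after the column zip codes
near <- sapply(colnames(x), function(cn) rn[x[, cn] < 2], simplify = FALSE)
print(near)

# For comparison: with a cutoff that every entry passes, apply() hands
# back a character matrix instead of a list
simplified <- apply(x, 2, function(z) rn[z < 10])
```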
Gregor Thomas
0

Here is the Tidyverse solution:

library(dplyr)
library(tidyr)

# your data
dat <- matrix(c(3,3.75,2,4.5,2.5,6,2.25,1.5,0.5), nrow = 3, ncol = 3)
rownames(dat) <- c(90034, 93201, 94501)
colnames(dat) <- c(91423, 92231, 94321)

# tidyverse solution
r <- rownames(dat)
dat_tidy <- dat %>%
  as_tibble() %>%
  mutate(x = r) %>%
  select(x, everything()) %>%
  gather(key = y,
         value = distance,
         -x) %>%
  filter(distance < 2)

print(dat_tidy)

# note: if your matrix is a symmetric matrix, then
# to remove duplicates, filter would be:
# filter(x < y,
#        distance < 2)
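
For the symmetric case mentioned in the note (one set of zip codes measured against itself), a base R sketch of the same de-duplication might look like the following. The matrix `d` and its distances are made up purely for illustration; they are not from the question.

```r
# Hypothetical symmetric distance matrix (values invented for illustration)
zips <- c("90034", "93201", "94501")
d <- matrix(c(0,   1.5, 3,
              1.5, 0,   0.5,
              3,   0.5, 0),
            nrow = 3, byrow = TRUE, dimnames = list(zips, zips))

# upper.tri() keeps only cells above the diagonal, so each pair is
# reported once and the zero self-distances are skipped
keep <- which(d < 2 & upper.tri(d), arr.ind = TRUE)

sym_pairs <- data.frame(x        = rownames(d)[keep[, "row"]],
                        y        = colnames(d)[keep[, "col"]],
                        distance = d[keep],
                        stringsAsFactors = FALSE)
print(sym_pairs)
```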
AlphaDrivers