I would like to group polygons together based on a distance criterion:

  • Any polygon within a certain distance (1200 metres or less) of an origin polygon is grouped with it
  • If other polygons are within the same distance (1200 metres or less) of these 'neighbouring' polygons, they are added to the same group
  • The process for this group continues until no further polygons are added (because all remaining polygons are further than 1200 metres away)
  • The next ungrouped polygon is then selected and the process repeats for a new group
  • A polygon with no neighbour within 1200 metres is assigned to a group by itself
  • A polygon should only belong to one group

The final output would be a table with the polygon ID (UID), the ID of the group it belongs to (GrpID), and the average distance between the polygons in that group.

I am sure this is possible with a distance matrix from st_distance, but I can't work out how.

library(sf)
library(dplyr)

## mode = "wb" keeps the zip file intact when downloading on Windows
download.file("https://drive.google.com/uc?export=download&id=1-I4F2NYvFWkNqy7ASFNxnyrwr_wT0lGF", destfile = "ProximityAreas.zip", mode = "wb")
unzip("ProximityAreas.zip")
Proximity_Areas <- st_read("Proximity_Areas.gpkg") 

Dist_Matrix <- Proximity_Areas %>% 
        st_distance(. , by_element = FALSE)
Chris

1 Answer

This function uses functions from the sf and igraph packages:

group_polygons <- function(polys, distance){
    ## get distance matrix
    dist_matrix = st_distance(polys, by_element = FALSE)
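    ## this object has units, so get rid of them
    ## (distance is assumed to be in the same units as the matrix, metres here):
    class(dist_matrix) = NULL
    ## make a binary 0/1 matrix where 1 if two polys are within the distance threshold
    connected = 1 * (dist_matrix <= distance)
    ## make an undirected graph; diag = FALSE ignores each polygon's zero distance to itself
    g = igraph::graph_from_adjacency_matrix(connected, mode = "undirected", diag = FALSE)
    ## the connected components of the graph are the groups
    return(igraph::components(g)$membership)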
}

You can use it like this:

Proximity_Areas$Group = group_polygons(Proximity_Areas, 1200)

Let's make a category for mapping:

Proximity_Areas$FGroup = factor(Proximity_Areas$Group)
plot(Proximity_Areas[,"FGroup"])

[Image: plot of Proximity_Areas coloured by FGroup]

There are three clusters here: the big one, one with 3 regions on the right, and one singleton region on the left. All the orange regions could be connected together by bridges that are less than 1200 m long.
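
To see why regions that are further apart than the threshold can still end up in the same group, here is a minimal base R sketch of the chaining idea. The four sites and their distances are invented for illustration, and flood_fill_groups is a hypothetical helper (not part of sf or igraph) that does by hand roughly what the connected-components step does, following the steps listed in the question:

```r
## Hypothetical helper: label the connected components of a
## logical (or 0/1) adjacency matrix with a simple flood fill.
flood_fill_groups <- function(connected) {
    n <- nrow(connected)
    membership <- rep(NA_integer_, n)
    group <- 0L
    for (start in seq_len(n)) {
        if (!is.na(membership[start])) next   # already in a group
        group <- group + 1L                   # start a new group
        membership[start] <- group
        queue <- start
        while (length(queue) > 0) {
            current <- queue[1]
            queue <- queue[-1]
            ## neighbours of the current polygon that are not grouped yet
            nb <- which(connected[current, ] & is.na(membership))
            membership[nb] <- group
            queue <- c(queue, nb)
        }
    }
    membership
}

## Invented distances (metres) between four sites A, B, C and D
sites <- c("A", "B", "C", "D")
d <- matrix(c(   0, 1000, 2000, 5000,
              1000,    0, 1000, 4000,
              2000, 1000,    0, 3000,
              5000, 4000, 3000,    0),
            nrow = 4, byrow = TRUE, dimnames = list(sites, sites))

groups <- flood_fill_groups(d <= 1200)
print(groups)
```

A and C are 2000 m apart, but both are within 1200 m of B, so all three share a group, while D ends up on its own.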

If you want to compute the average distance without re-computing the distance matrix, you can do this within the function by subsetting the distance matrix according to the membership values returned by the components function. The key step is computing the binary 0/1 matrix and treating it as an adjacency matrix, so that igraph can compute its connected components.
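
One way that could look is sketched below. The names mean_group_distance and group_polygons_with_distance are hypothetical. I have assumed that "average distance" means the mean of the pairwise distances between distinct polygons in a group (so singleton groups get NA), and that the ID column is called UID as in the question. Note that st_distance measures the shortest distance between polygons, not the distance between their centroids.

```r
## Hypothetical helper: mean pairwise distance within each group.
## dist_matrix is a plain numeric matrix, membership a vector of group ids.
mean_group_distance <- function(dist_matrix, membership) {
    groups <- sort(unique(membership))
    out <- sapply(groups, function(grp) {
        idx <- which(membership == grp)
        if (length(idx) < 2) return(NA_real_)   # a singleton has no pairs
        sub <- dist_matrix[idx, idx, drop = FALSE]
        mean(sub[upper.tri(sub)])               # count each pair once
    })
    names(out) <- groups
    out
}

## Hypothetical wrapper: same grouping as above, but returns a table
## with the polygon id, its group id and the group's mean distance.
group_polygons_with_distance <- function(polys, distance, id_col = "UID") {
    dist_matrix <- sf::st_distance(polys, by_element = FALSE)
    class(dist_matrix) <- NULL                  # drop the units class
    connected <- 1 * (dist_matrix <= distance)
    g <- igraph::graph_from_adjacency_matrix(connected, mode = "undirected",
                                             diag = FALSE)
    membership <- as.integer(igraph::components(g)$membership)
    grp_mean <- mean_group_distance(dist_matrix, membership)
    data.frame(UID = polys[[id_col]],
               GrpID = membership,
               MeanDist = unname(grp_mean[as.character(membership)]))
}

## e.g. result <- group_polygons_with_distance(Proximity_Areas, 1200)
```

The distance matrix is computed once and reused for both the grouping and the averages, which matters because st_distance is the expensive step for a large set of polygons.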

Spacedman