12

I'm attempting to use the sf package and a piped/tidyverse workflow to generate bounding boxes based on groups defined in another column. I think it should work like below, but it seems like st_bbox is not respecting groups.

I expect to receive three polygon records that represent bounding boxes around proints from a, b, and c, but instead I receive three polygon records that represent bounding boxes for all points.

library(dplyr)
library(sf)

a <- data.frame(group=rep('a',100), lon=rnorm(100,11,.2), lat=rnorm(100,53,.2))
b <- data.frame(group=rep('b',100), lon=rnorm(100,11.5,.2), lat=rnorm(100,53.5,.2))
c <- data.frame(group=rep('c',100), lon=rnorm(100,12,.2), lat=rnorm(100,54,.2))
dat <- rbind(a,b,c)

pts <- dat %>% st_as_sf(coords=c('lon','lat'),crs=4326) 

pts %>%
  group_by(group) %>%
  summarize(geometry = st_as_sfc(st_bbox(geometry)))

This returns:

Simple feature collection with 3 features and 1 field
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 10.34313 ymin: 52.43993 xmax: 12.54254 ymax: 54.54012
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
# A tibble: 3 x 2
  group                                                                                     geometry
  <fct>                                                                                <POLYGON [°]>
1 a     ((10.34313 52.43993, 12.54254 52.43993, 12.54254 54.54012, 10.34313 54.54012, 10.34313 52...
2 b     ((10.34313 52.43993, 12.54254 52.43993, 12.54254 54.54012, 10.34313 54.54012, 10.34313 52...
3 c     ((10.34313 52.43993, 12.54254 52.43993, 12.54254 54.54012, 10.34313 54.54012, 10.34313 52...
Ben Carlson
  • 1,053
  • 2
  • 10
  • 18
  • What's the format you want to end up with? I'm messing around with this and can get a list-column of `bbox` objects that are different for each group, but am having trouble figuring out a next step to cast as a multipolygon or something else that more closely resembles the `geometry` columns that `sf` objects usually have. But maybe that's as far as you intend to take it anyway. – camille Feb 14 '19 at 18:34

2 Answers2

12

One option is to use a nested dataframe using tidyr::nest and then purrr::map. I also used a wrapper function to simplify the map call

library(tidyverse)
box_sf <- pts %>% 
  group_by(group) %>%
  nest()

bbox_wrap <- function(x) st_as_sfc(st_bbox(x))
box_sf <- box_sf %>% 
  mutate(bbox = map(data, bbox_wrap))

This will give you a list of bounding boxes as a column of a dataframe. If you want to convert back to an sf object you can do this:

box_sf %>% 
  mutate(geometry = st_as_sfc(do.call(rbind, bbox))) %>% 
  select(-data, -bbox) %>% 
  st_as_sf()

Seems a little roundabout, I'm hoping to see a solution using group_by as you originally intended

astrofunkswag
  • 2,608
  • 12
  • 25
  • 1
    Thanks this is great! It would be great if st_bbox respected group_by, but sf is already really amazing and we can't have everything. – Ben Carlson Feb 18 '19 at 23:09
  • 1
    Reproducing the *convert back to an sf object* code I've the folloving error: ` Error in `mutate()`: ! Problem while computing `geometry = st_as_sfc(do.call(rbind, bbox))`. i The error occurred in group 3: group = "c". Caused by error in `attributes(ret) <- attributes(lst[[1]])`: ! dims [product 1] do not match the length of object [3]` – Tiziano Aug 22 '22 at 13:53
0

The st_bbox() function doesn't seem to work with group_by() because it pulls the bbox attribute from the sf_points which isn't defined for each individual group.

One way would be to create the bounding boxes manually using something like:

library(dplyr)
library(sf)
library(ggplot2)
library(tidyr)

# function calculates angle with respect to polygon centroid.
# we need this to order the polygon correctly
calc_angle <- function(lon,lat) {
  cent_lon <- mean(lon)
  cent_lat <- mean(lat)
  ang <- atan2(lat - cent_lat, lon - cent_lon)

  return(ang)
}

bbox <-dat %>%
  group_by(group) %>%
  summarise(xmin = min(lon),ymin = min(lat), xmax=max(lon),  ymax = max(lat)) %>%
  gather(x,lon,c('xmin','xmax')) %>%
  gather(y,lat,c('ymin','ymax')) %>%
  st_as_sf(coords=c('lon','lat'),crs=4326,remove=F) %>%
  group_by(group) %>%
  mutate(angle = calc_angle(lon,lat)) %>%
  arrange(angle) %>%
  summarise(do_union=FALSE) %>%
  st_cast('POLYGON')

Essentially we calculate our own bbox by getting the xmin,xmax,ymin,ymax for each group. We then gather to keep just the x and y values and sort the points in clockwise order before casting to polygon.

Looks a bit messy but it's one way to use group_by() to solve this.

pts <- dat %>% 
  st_as_sf(coords=c('lon','lat'), crs=4326, remove=F)

ggplot(pts) + geom_sf(aes(col=group)) +
  geom_sf(data=bbox, aes(col=group), fill=NA)

enter image description here

Chris
  • 3,836
  • 1
  • 16
  • 34