What is the quickest way of cropping (clipping) a complex SpatialPolygonsDataFrame with the @data
preserved in R using another, possibly complex, SpatialPolygon? I know of two ways (shown under). The raster
way is quicker for less complex SpatialPolygonsDataFrames and returns a SpatialPolygonsDataFrame as in the example. It gets slow with large SpatialPolygonsDataFrames, though. The rgeos
way is quicker for large SpatialPolygonsDataFrame, but it sometimes fails with very complex geometries and does not return SpatialPolygonsDataFrames as default.
I have not paid attention to the GIS development in R lately and there might well be quicker ways of doing this by now.
The example polygons are small to honor people's computers and bandwidth. Consider the "real" polygons 50-1000 MB.
Setup:
library(rnaturalearth)
library(sp)
library(raster)
library(rgeos)
dt <- rnaturalearth::ne_countries()
clip_boundary <- sp::SpatialPolygons(list(sp::Polygons(
list(sp::Polygon(data.frame(lon = c(0, 180, 180, 0), lat = c(40, 40, 80, 80)))), ID = 1)))
The rgeos
way:
system.time({
clipped.dt <- rgeos::gIntersection(dt, clip_boundary, byid = TRUE)
ids <- sapply(slot(clipped.dt, "polygons"), function(x) slot(x, "ID"))
ids <- sapply(strsplit(ids, " "), "[", 1)
tmp.df <- data.frame(dt@data[ids,])
names(tmp.df) <- names(dt@data)
out <- sp::SpatialPolygonsDataFrame(clipped.dt, tmp.df, match.ID = FALSE)
})
# user system elapsed
# 0.069 0.002 0.074
class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"
The raster
way:
system.time({
out <- raster::crop(dt, clip_boundary)
})
# user system elapsed
# 0.042 0.001 0.043
class(out)
# [1] "SpatialPolygonsDataFrame"
# attr(,"package")
# [1] "sp"
A plot for a reference (not relevant for the question):
plot(out)