You can use raster::extract() to summarize raster data based on your polygons.
I often find it necessary to specify the package for extract to be sure you are grabbing the right version.
library(sp) # I'm assuming that's the source for st_read
library(raster)
my_shape <- st_read("multipolygon shapfile with 196 provinces inside Peru")
r <- raster("raster file of temp with 1km2 grid")
## not necessary to crop source raster
my_shape$mean <- raster::extract(r, my_shape, fun = mean, na.rm = TRUE)
This should add an attribute column to your province polygon layer. You can do the same with fun = min and fun = max.
If your raster layer is large, or you want to do this several times you will probably get better performance using alternative packages like velox
.
library(velox)
vs <- velox("raster file of temp with 1km2 grid")
my_shape$mean2 <- vs$extract(my_shape, fun = function(x){mean(x, na.rm= TRUE)})
The velox approach is much faster (see below)
Another alternative is to use the stars package, and use aggregate()
to summarize the min and max data. see How to extract values from a raster using polygons with the R stars package?
library(stars)
library(microbenchmark)
rstar = read_stars(raster_file)
st_crs(rstar) <- st_crs(my_shape)
f1 = function(){
my_shape$mean = raster::extract(r, my_shape, fun = mean, na.rm = TRUE)
}
f2 = function(){
vs <- velox(raster_file)
my_shape$mean2 <- vs$extract(my_shape, fun = function(x){mean(x, na.rm= TRUE)})
}
f3 = function(){
zones = rasterize(my_shape, r)
raster::zonal(r, zones)
}
f4 = function(){
x = aggregate(rstar, my_shape, mean, na.rm = T)
st_as_sf(x)
}
res = microbenchmark(f1(), f2(), f3(), f4(), times = 3)
res
# Unit: seconds
# expr min lq mean median uq max neval
# f1() 172.283024 174.644299 178.920625 177.005575 182.2394 187.473276 3
# f2() 3.339221 3.374647 3.396874 3.410072 3.4257 3.441327 3
# f3() 170.981220 179.481681 182.520730 187.982142 188.2905 188.598829 3
# f4() 167.076081 173.427232 179.332300 179.778383 185.4604 191.142437 3
velox$extract is the clear performance winner. The other three versions seem very similar.
The processing time for zonal is mostly in rasterizing the polygon layer, it would beat out the extract and aggregate if you are using for multiple metrics.
Reading the raster from the file is included in the velox expression, but not the other versions.
