You can use gdalUtils::gdalwarp
for this. For me, it's less efficient than @JosephWood's fasterAgg.Fun
for rasters with 1,000,000 cells, but for Joseph's larger example it's much faster. It requires that the raster exists on disk, so factor writing time into the below if your raster is in memory.
Below, I've used the modification of fasterAgg.Fun
that returns the most frequent value, rather than its index in the block.
library(raster)
x <- matrix(rpois(10^8, 2), 10000)
a <- raster(x)
fasterAgg.Fun <- function(x,...) {
myRle.Alt <- function (x1) {
n1 <- length(x1)
y1 <- x1[-1L] != x1[-n1]
i <- c(which(y1), n1)
x1[i][which.max(diff(c(0L, i)))]
}
if (sum(x)==0) {
return(NA)
} else {
myRle.Alt(sort(x, method="quick"))
}
}
system.time(a2 <- aggregate(a, fact=10, fun=fasterAgg.Fun))
## user system elapsed
## 67.42 8.82 76.38
library(gdalUtils)
writeRaster(a, f <- tempfile(fileext='.tif'), datatype='INT1U')
system.time(a3 <- gdalwarp(f, f2 <- tempfile(fileext='.tif'), r='mode',
multi=TRUE, tr=res(a)*10, output_Raster=TRUE))
## user system elapsed
## 0.00 0.00 2.93
Note that there is a slight difference in the definition of the mode when there are ties: gdalwarp
selects the highest value, while the functions passed to aggregate
above (via which.max
's behaviour) select the lowest (e.g., see which.max(table(c(1, 1, 2, 2, 3, 4)))
).
Also, storing the raster data as integer is important (when applicable). If the data are stored as float (the writeRaster
default), for example, the gdalwarp
operation above takes ~14 sec on my system. See ?dataType
for available types.