0

I am working with raster data and trying to crop and mask a variety of buffers to each raster in a raster stack for various locations. The result is a list of a list of rasters. I got the code to work for a small subset of the data, but now I am trying it over the whole dataset, and it is working very slowly. See example code:

# Example data ------------------------------------------------------------

#create example raster stack
r1 = raster(nrows=1000,ncol=1000,xmn=60,xmx=90,ymn=0,ymx=25)
rr = lapply(1:100, function(i) setValues(r1,runif(ncell(r1))))
rrstack=stack()
for (i in 1:length(rr)){
  stacknext=rr[[i]]
  rrstack=stack(rrstack,stacknext)
}


#create example shapefile list

lats=runif(26,min=0,max=25)
lons=runif(26,min=60,max=90)
exnames=paste0("city_",letters)
coords=data.frame(names=exnames,lats=lats,lons=lons)
coords_sf = st_as_sf(coords,coords=c("lons","lats"),crs=4326,dim ="XY")
circle1=st_buffer(coords_sf, 1E3)
circle100=st_buffer(coords_sf,1E5)
circle500=st_buffer(coords_sf,5E5)
circlist=list(circle1=circle1,circle100=circle100,circle500=circle500)

circlist_reproj=lapply(circlist,function(x) st_transform(x,crs(rrstack[[1]])))

start <- proc.time()
citlist <- vector(mode='list',length=nrow(circlist_reproj[[1]]))
dellist <- vector(mode='list',length=length(circlist_reproj))
mystack <- stack()
for(k in 1:nrow(circlist_reproj[[1]])) {
  for(j in 1:length(circlist_reproj)) { 
    for (i in 1:nlayers(rrstack)){
      maskraster <- raster::mask(rrstack[[i]],circlist_reproj[[j]][k,])
      maskraster <- raster::crop(maskraster,circlist_reproj[[j]][k,])
      mystack <- stack(mystack,maskraster)
    }
    dellist[[j]] <- mystack
    mystack <- stack()
  }
  citlist[[k]] <- dellist
  dellist <- vector(mode='list',length=length(circlist_reproj))
}
basetime <- proc.time()-start

#time taken for computation

 basetime
    user   system  elapsed 
 940.173   84.366 1029.688 

As you can see for a set of data smaller than what I have, the computation takes a while. I wanted to try and parallelize the processing but am having trouble figuring out how to do so. I have two issues with it right now. First because of the nature of the nested for loop, I am not sure which for loop I should change to foreach. According to this post, it looks like it is the first one, though I am not sure that stands for all nested for loops. When I make the first for loop foreach I then get the error Error in { : task 1 failed - "could not find function "nlayers"" I then try and add the package argument in the foreach call resulting in a nested for loop that looks like

foreach(k = 1:nrow(circlist_reproj[[1]], .packages='raster')) %dopar% {
  for(j in 1:length(circlist_reproj))  { 
    for (i in 1:nlayers(rrstack)) {
      maskraster <- raster::mask(rrstack[[i]],circlist_reproj[[j]][k,])
      maskraster <- raster::crop(maskraster,circlist_reproj[[j]][k,])
      mystack <- stack(mystack,maskraster)
    }
    dellist[[j]] <- mystack
    mystack <- stack()
  }
  citlist[[k]] <- dellist
  dellist <- vector(mode='list',length=length(circlist_reproj))
}

Which then gives the error

  unused argument (.packages = "raster")

So I am not sure how to properly apply the .packages argument to the foreach function. What am I doing wrong here?

EDIT

Taking @HenrikB's comment, I have looked at my code and reworked it. I now have the following foreach loops. Now the code completes, but it results in all null values.

cores <- detectCores()
cl <- makeCluster(cores[1]-2) #not to overload your computer
registerDoParallel(cl)

start <- proc.time()
citlist <- vector(mode='list',length=nrow(circlist_reproj[[1]]))
dellist <- vector(mode='list',length=length(circlist_reproj))
mystack <- stack()
foreach(k = 1:nrow(circlist_reproj[[1]])) %:%
  foreach(j = 1:length(circlist_reproj))%:%   
  foreach (i = 1:nlayers(rrstack), .packages=c('raster','sf')) %dopar% {
    maskraster <- raster::mask(rrstack[[i]],circlist_reproj[[j]][k,])
    maskraster <- raster::crop(maskraster,circlist_reproj[[j]][k,])
    mystack <- stack(mystack,maskraster)
    dellist[[j]] <- mystack
    mystack <- stack()
    citlist[[k]] <- dellist
    dellist <- vector(mode='list',length=length(circlist_reproj))
  }
partime <- proc.time()-start
johnnyg
  • 129
  • 8
  • Regarding `unused argument (.packages = "raster")`: it's a typo in your code - have a look again where you want that argument to be. – HenrikB Jan 17 '22 at 23:01
  • @HenrikB I addressed your comment and got rid of the error. Thanks for that! However, the code results in all NULL values, so the result is not quite what I am looking for yet. – johnnyg Jan 19 '22 at 18:59
  • Next: `foreach() %dopar% { ... }` is _not(!)_ a for loop. It's the number one most common misconception and mistake, which results into what you're experiencing. All results with `{ ... }` has to be _returned_, e.g. `y <- foreach(i=1:3) %dopar% { sqrt(i) }`. You _cannot_ assign it to something outside, i.e. `y <- list(); foreach(i=1:3) %dopar% { y[[i]] <- sqrt(i) }` won't work for the same reason `y <- list(); lapply(1:3, function(i) { y[[i]] <- sqrt(i) })` won't work. Search StackOverflow, and you'll see plenty of example of this mistake. – HenrikB Jan 19 '22 at 22:51

1 Answers1

0

After taking @Henrik's comments and reworking my code a bit, I was able to come up with a solution that solves the problem via parallelization, however it is slower than the base solving. But that is for another post. Here is the solution:

cores <- detectCores()
cl <- makeCluster(cores[1]-2) #not to overload your computer
registerDoParallel(cl)

citlist <- vector(mode='list',length=nrow(circlist_reproj[[1]]))
dellist <- vector(mode='list',length=length(circlist_reproj))
for(k in 1:nrow(circlist_reproj[[1]])) {
  for(j in 1:length(circlist_reproj)) { 
    parrasterstack <- foreach(i=1:nlayers(rrstack),.packages=c('raster','sf')) %dopar% {
    maskraster <- raster::mask(rrstack[[i]],circlist_reproj[[j]][k,])
    raster::crop(maskraster,circlist_reproj[[j]][k,])
    }
  parrasterstack <- stack(parrasterstack)
  dellist[[j]] <- parrasterstack
  parrasterstack <- NULL
  }
citlist[[k]] <- dellist
dellist <- vector(mode='list',length=length(circlist_reproj))
}

stopCluster(cl)
johnnyg
  • 129
  • 8