I want to calculate GLCM statistics for 488 raster files. Because the computation takes a very long time, I want to use all the cores of my multicore processor (AMD Phenom II, 6 cores).

library(devtools)
install_github('azvoleff/glcm')  # install first, then load
library(glcm)
library(raster)

setwd("working dir.")  # placeholder: replace with your working directory
rasters <- list.files()[grep("()\\w*.tif", list.files())]
statistics <- c("mean", "variance", "homogeneity", "contrast", "dissimilarity", "entropy","second_moment", "correlation")
shift1 <- c(0,0,1,1)
shift2 <- c(0,1,0,1)

for (j in 1:length(rasters)){ 
  raster1 <- raster(rasters[j])
  for (i in 1:length(statistics)){
    for (k in 1:length(shift1)){
      GLCM <- glcm(raster1, window=c(11,11), statistics=statistics[i], shift = c(shift1[k],shift2[k]), na_opt="ignore")

      # use `rasters` here: `tiles` was never defined
      file <- paste("./GLCM/", substr(rasters[j],0,nchar(rasters[j])-4),"_", statistics[i], "_shift_",shift1[k], shift2[k] , ".tif", sep="")
      writeRaster(GLCM, filename = file, format = "GTiff")
    }

  }
  gc()
}

I searched the internet for multicore solutions in R, but could not find out which one is up to date. So I hope someone can help me.

Tonechas
loki
  • Which solutions did you find? `doParallel` and `doMC`, which work with `foreach`, were updated ~2 months ago. `multicore` is also a good option. What OS are you using? – Jake Burkhead Apr 24 '14 at 16:53
  • The problem with getting value from multiple cores is to have a parallel algorithm. It's not clear to me that you have addressed that issue. – IRTFM Apr 24 '14 at 16:53
  • @JakeBurkhead I found these, but do not know how exactly to apply my code on these. @BondedDust: How do I find out if `glcm` is a parallel algorithm? – loki Apr 24 '14 at 17:05
  • What specific issues did you have with your code? You should be able to change the outer loop to a `foreach` loop and use `doParallel` as a parallel backend. http://cran.r-project.org/web/packages/doParallel/vignettes/gettingstartedParallel.pdf – Jake Burkhead Apr 24 '14 at 17:21

1 Answer

glcm is not coded to run in parallel, but given that you are processing 488 rasters, I wouldn't worry about parallelizing the algorithm itself. Processing the rasters in parallel (say two at a time on an average laptop, more if you have more processing power and RAM) is the simplest approach here. glcm versions > 1.4 will automatically run block by block over large images (and will account for edge effects), so memory shouldn't be an issue.

Something like the below should get you started (based on your code):

library(glcm)
library(raster)
library(foreach)
library(doParallel)

cl <- makeCluster(2)  # number of workers; raise it if you have the cores and RAM
registerDoParallel(cl)

setwd("working dir.")  # placeholder: replace with your working directory
rasters <- list.files()[grep("()\\w*.tif", list.files())]
statistics <- c("mean", "variance", "homogeneity", "contrast",
                "dissimilarity", "entropy","second_moment",
                "correlation")
shift1 <- c(0, 0, 1, 1)
shift2 <- c(0, 1, 0, 1)

foreach(j = 1:length(rasters), .packages = c('raster', 'glcm')) %dopar% {
  raster1 <- raster(rasters[j])
  for (i in 1:length(statistics)) {
    for (k in 1:length(shift1)) {
      GLCM <- glcm(raster1, window=c(11,11), statistics=statistics[i],
                   shift = c(shift1[k],shift2[k]), na_opt="ignore")
      # use `rasters` here: `tiles` is not defined anywhere
      file <- paste("./GLCM/", substr(rasters[j], 0, nchar(rasters[j])-4),
                    "_", statistics[i], "_shift_",shift1[k], shift2[k],
                    ".tif", sep="")
      writeRaster(GLCM, filename = file, format = "GTiff")
    }
  }
}

stopCluster(cl)
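
If you are unsure how many workers to start, the sketch below shows one way to size the cluster from the machine's core count and to check that the backend really runs in parallel before launching the full job. It is only an illustration: leaving one core free is an assumption rather than a glcm requirement, `n_workers` and `results` are names introduced here, and the squared-number loop body is a hypothetical stand-in for the per-raster work.

```r
library(parallel)
library(foreach)
library(doParallel)

# detectCores() can return NA on some platforms, so fall back to 2 workers.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 2

# Assumption: keep one core free for the OS; never go below one worker.
n_workers <- max(1, n_cores - 1)

cl <- makeCluster(n_workers)
registerDoParallel(cl)

# Hypothetical stand-in for the per-raster work: each task squares its index.
results <- foreach(j = 1:4, .combine = c) %dopar% {
  j^2
}

stopCluster(cl)
```

Each worker loads its own copy of a raster, so if RAM rather than CPU is the limit, lower `n_workers` instead of using every core.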
Alex Zvoleff