
I'm using the raster (2.1-49) package in R (3.0.1) to read in many rasters, calculate some statistics and store the results. Some of the rasters are too big to hold in memory, so they are written as temporary files in a folder different from the one indicated by tempdir(). The problem is that in reality I have over 5000 rasters, and the temporary files fill up my hard drive before the script finishes running. I would like to overwrite the same temporary file on each iteration of the loop. My code looks something like this:

require(raster)
names <- 1:5000
for (i in 1:5000)
{
 r <- raster(paste("rast_", names[i], ".tif", sep=""))
 #Stats Code#
}

Adding filename="C:/temp",overwrite=T to the raster function line of code did not work. However, these two additional options work with the rasterize function from the same package...

Is there a way to set a single temporary file that can be overwritten for the raster function?

Any help much appreciated.

JPD
  • Have you tried `rm(r)` at the end of each loop (i.e. after `#Stats Code#` and before `}`)? – Simon O'Hanlon Sep 23 '13 at 09:04
  • I've just tried adding this in, and unfortunately multiple temporary .grd files are still being produced in the `tempdir()`. – JPD Sep 23 '13 at 09:36
  • So the `tempdir()` should be the same for the entire R session, so `rm( tempdir() )` could work instead. – Simon O'Hanlon Sep 23 '13 at 09:37
  • Maybe you could just create one temporary file name with `tempfile()`, and remove it with `rm` at the end of your code ? – juba Sep 23 '13 at 09:40
  • I've just tried setting the `tempfile()` but it doesn't influence the location of the temporary rasters. I've just discovered that the `raster` package seems to have its own temporary directory, different from the one indicated by `tempdir()`. (Have edited question to reflect this). – JPD Sep 23 '13 at 10:27
  • I'm missing something here: where's the code that writes `r` to some file? `raster` **reads** from a file. – Carl Witthoft Sep 23 '13 at 11:32
  • As far as I know, if `r` is too big to store in memory, a temporary .grd file is written. – JPD Sep 23 '13 at 12:09
  • @JPD -- I've always assumed that too (about what happens if `r` is too big to store in memory), but if you know where that is documented, I'd like to see it. Unfortunately, `raster()` doesn't take an `outfilename` argument that would allow you to force it to write the rasterObject to disk, and let you set the name of the file it's written to. – Josh O'Brien Sep 23 '13 at 12:47
  • After a bit more poking around, it looks like some combo of `rasterTmpFile()`, `removeTmpFiles(h=0)`, `raster:::.tmpdir()`, `rasterOptions(tmpdir="name_your_own_tmpdir")`, and friends may help. Let us know how it goes. – Josh O'Brien Sep 23 '13 at 14:25
  • @Josh O'Brien Fantastic! Simply adding `removeTmpFiles(h=0)` within the loop at the end of the code block works fine. The other functions you've mentioned are also useful, thanks. Would you like to put this in as an answer? – JPD Sep 23 '13 at 15:11
  • @JPD -- Glad that helped. I would answer, but I don't like that as a solution in general, because it removes **all** temp files, not just the one you just created. It sounds like that's fine in your situation, and may well be in all other cases too, but I still don't understand where and when **raster** writes all of its temp files to disk, so I don't know what clashes `removeTmpFiles(h=0)` might cause. – Josh O'Brien Sep 23 '13 at 15:24
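
The approach JPD reports as working in the comments above can be sketched roughly as follows. This is only an illustration, not the original script: `mean_of_each_raster` is a hypothetical helper, and `cellStats(r, stat = "mean")` merely stands in for the `#Stats Code#` step. `removeTmpFiles(h = 0)` asks raster to delete its temp files whatever their age, so, as Josh O'Brien cautions, it is only appropriate when nothing else sharing the same raster temp directory still needs those files.

```r
library(raster)

# Hypothetical helper: compute one summary value per raster file and clear
# raster's temp files after every iteration so they cannot accumulate.
mean_of_each_raster <- function(files) {
  results <- numeric(length(files))
  for (i in seq_along(files)) {
    r <- raster(files[i])
    # Placeholder for "#Stats Code#"
    results[i] <- cellStats(r, stat = "mean")
    # Drop the object first, then remove raster's temp files of any age
    rm(r)
    removeTmpFiles(h = 0)
  }
  results
}

# Usage, assuming files named as in the question exist in the working directory:
# stats <- mean_of_each_raster(paste0("rast_", 1:5000, ".tif"))
```

Where raster puts its temp files can be checked with `dirname(rasterTmpFile())`.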

1 Answer

This question is similar to this one. Drawing on this discussion, I found a better way to manage the problem: create a temporary directory from within the loop or parallel process, named after a unique value from the data being processed in that iteration (in my case, the value of single@data$OWNER).

I am using parallel looping and, as @Josh O'Brien notes above, I don't want to remove all the files from a common temporary directory because it could remove other processes' temp files. Here's the code I used:

#creates unique filepath for temp directory
dir.create (file.path("c:/",single@data$OWNER), showWarnings = FALSE)

#sets temp directory
rasterOptions(tmpdir=file.path("c:/",single@data$OWNER)) 

You then insert your processing code here, then at the end of the loop delete the whole folder:

#removes entire temp directory without affecting other running processes
unlink(file.path("c:/",single@data$OWNER), recursive = TRUE)
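
Put together, the three steps above might be wrapped in a small helper like the one below. This is a sketch under assumptions, not the code I actually ran: `with_private_raster_tmpdir`, its `id`, `work` and `root` arguments, and the `raster_tmp_` prefix are all made-up names, and the directory is created under `tempdir()` rather than `c:/` so the example does not depend on a Windows drive.

```r
library(raster)

# Hypothetical helper: run `work()` with raster's temp files confined to a
# directory named after `id`, then delete only that directory.
with_private_raster_tmpdir <- function(id, work, root = tempdir()) {
  tmp <- file.path(root, paste0("raster_tmp_", id))
  dir.create(tmp, recursive = TRUE, showWarnings = FALSE)
  # Clean up this job's directory even if `work()` stops with an error
  on.exit(unlink(tmp, recursive = TRUE), add = TRUE)
  # Point raster at the private directory for the duration of the job
  rasterOptions(tmpdir = tmp)
  work()
}

# Usage inside a loop or parallel worker, with a unique id per job:
# with_private_raster_tmpdir(single@data$OWNER, function() {
#   # processing code here
# })
```

After the helper returns, the `tmpdir` option still names the directory that was just removed, so set it again (or call the helper again) before doing further raster work in the same session.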
Luke Macaulay