41

Is it possible to monitor the amount of memory that is in use, or has been used, by R during a call to a function? For example, I have an arbitrary function, e.g.:

smallest.sv <- function(){
  A <- matrix(rnorm(1e6), 1e3);
  mysvd <- svd(A);
  return(tail(mysvd$d, 1));
}

Running the function simply returns a scalar, but a lot of memory was used to compute that result. Now I need to do performance benchmarking. Processing time is easy:

system.time(x <- smallest.sv())

However, I would also like to know how much memory was needed for this call, without modifying the function (it should work for arbitrary functions). Is there any way to do this?

Edit: to clarify a bit. I am mostly interested in the upper bound of memory that was in use during the call of the function, i.e. how much physical memory is required to be able to process the function call. In many cases this is, I think, significantly less than the total amount of allocated memory.
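
For example, something along these lines would be in the right spirit, although resetting R's "max used" statistics with gc(reset = TRUE) only covers memory managed by R's own garbage collector:

gc(reset = TRUE)    # reset the "max used" statistics
x <- smallest.sv()
gc()                # the "max used" / "(Mb)" columns now show the peak since the reset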

Jeroen Ooms
  • Hi Jeroen. Did you solve this problem? What was your solution? I am also facing this issue. I wish to monitor the upper bound of memory used during the call of the function. – SixSigma Feb 26 '16 at 23:30
  • Have a look at profvis: https://github.com/rstudio/profvis – Jeroen Ooms Feb 27 '16 at 11:29
  • Thanks, Jeroen. I did check that before, but it seems to me that the package is only used for profiling the time used by the code. I didn't see any memory-monitoring functionality. Did I miss something here? Or could you give me more hints? – SixSigma Feb 27 '16 at 16:06
  • See also this discussion on profiling memory usage of compiled code in R: https://stackoverflow.com/questions/58278838/memory-profiling-with-data-table – Michael Jul 17 '20 at 21:50

4 Answers

23

R provides memory profiling support; see Section 3.3 of the Writing R Extensions manual:

3.3 Profiling R code for memory use

Measuring memory use in R code is useful either when the code takes more memory than is conveniently available or when memory allocation and copying of objects is responsible for slow code. There are three ways to profile memory use over time in R code. All three require R to have been compiled with `--enable-memory-profiling', which is not the default, but is currently used for the Mac OS X and Windows binary distributions. All can be misleading, for different reasons.

In understanding the memory profiles it is useful to know a little more about R's memory allocation. Looking at the results of `gc()' shows a division of memory into `Vcells' used to store the contents of vectors and `Ncells' used to store everything else, including all the administrative overhead for vectors such as type and length information. In fact the vector contents are divided into two pools. Memory for small vectors (by default 128 bytes or less) is obtained in large chunks and then parcelled out by R; memory for larger vectors is obtained directly from the operating system.

and then describes each of the three approaches (memory statistics from Rprof, allocation tracking with Rprofmem, and tracing copies with tracemem) in its own subsection.
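
For instance, the third approach, tracing copies of an object with tracemem, looks roughly like this (a minimal sketch; it needs the same --enable-memory-profiling build mentioned in the quote, which the CRAN binaries have):

A <- matrix(rnorm(1e6), 1e3)
tracemem(A)        # returns the object's address and starts tracing it
B <- A             # no copy yet, just another reference to the same data
B[1, 1] <- 0       # modifying B forces a duplication, which is reported
untracemem(A)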

MichaelChirico
Dirk Eddelbuettel
15

One option is to use Rprof. A simple approach is this:

Rprof(tf <- "rprof.log", memory.profiling=TRUE)

[your code]

Rprof(NULL)
summaryRprof(tf)

This will give you some information on memory usage.
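
Applied to the function from the question, that looks roughly like this (a sketch; the reported numbers depend on when the sampling interval happens to hit):

Rprof(tf <- "rprof.log", memory.profiling = TRUE)
x <- smallest.sv()
Rprof(NULL)
summaryRprof(tf, memory = "both")$by.total   # includes a mem.total column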

Ryogi
  • Thanks, this is useful. However, as I understand it, Rprofmem logs all of the memory that was allocated, but it does not take garbage collection into account? I am mostly interested in the upper bound of memory that is in use during the processing of the function. – Jeroen Ooms Oct 22 '11 at 00:00
  • 3
    You could use your OS's performance monitor. Take a reader before the operation and a reader after. On Windows that's perfmon – Suraj Oct 22 '11 at 16:19
5

You can get the upper bound of memory that is in use during the processing of a function or a set of commands with gc:

smallest.sv <- function(){
  A <- matrix(rnorm(1e6), 1e3);
  mysvd <- svd(A);
  return(tail(mysvd$d, 1));
}

tt <- sum(.Internal(gc(FALSE, TRUE, TRUE))[13:14])
x <- smallest.sv()
sum(.Internal(gc(FALSE, FALSE, TRUE))[13:14]) - tt
#62 MB
rm(x)

This upper bound is influenced by garbage collection, so turning on gctorture will give the lowest upper bound:

tt <- sum(.Internal(gc(FALSE, TRUE, TRUE))[13:14])
gctorture(on = TRUE)
x <- smallest.sv()
gctorture(on = FALSE)
sum(.Internal(gc(FALSE, FALSE, TRUE))[13:14]) - tt
#53.7 MB

Other tools like Rprof, Rprofmem, profmem::profmem, bench::mark or profvis::profvis can also show the memory usage.

#Using Rprof (enabling profiling is a compile-time option: ./configure --enable-R-profiling)
gc()
Rprof("Rprof.out", memory.profiling=TRUE)
x <- smallest.sv()
Rprof(NULL)
max(summaryRprof("Rprof.out", memory="both")$by.total$mem.total)
#45.9
#The status is sampled at defined intervals, so the result depends on whether you hit the peak

#Using Rprofmem (enabling memory profiling is a compile-time option: ./configure --enable-memory-profiling)
Rprofmem("Rprofmem.out"); x <- smallest.sv(); Rprofmem(NULL) #When first run, there is much more in the log file
gc()
Rprofmem("Rprofmem.out")
x <- smallest.sv()
Rprofmem(NULL)
sum(as.numeric(read.table("Rprofmem.out", comment.char = ":")[,1]), na.rm=TRUE)
#88101752
#Writes out the memory amount when it is allocated

library(profmem) #uses utils::Rprofmem
gc()
total(profmem(x <- smallest.sv()))
#88101752

library(bench) #uses utils::Rprofmem
gc()
mark(x <- smallest.sv())[,"mem_alloc"]
#84MB
#Warning message:
#Some expressions had a GC in every iteration; so filtering is disabled. 

library(profvis) #uses utils::Rprof
gc()
profvis(x <- smallest.sv())
#Opens a browser window where, under Memory, you can read -23.0 | 45.9

Rprofmem shows the memory that was cumulatively allocated and does not consider the memory that was freed during execution. To increase the probability that Rprof hits the peak, you can select a short time interval and/or repeat the procedure.

max(replicate(10, {
    gc()
    Rprof("Rprof.out", memory.profiling=TRUE, interval = runif(1,.005,0.02))
    x <- smallest.sv()
    Rprof(NULL)
    max(summaryRprof("Rprof.out", memory="both")$by.total$mem.total)
}))
#76.4

Here I got a higher value than I get from gc, which demonstrates that memory usage is influenced by garbage collection, and that the upper bound of memory in use during the processing of a function may vary from call to call as long as gctorture is not turned on.

GKi
3

You can get processing time as well as peak memory with peakRAM:

library(peakRAM)

peakRAM(smallest.sv())

  Function_Call Elapsed_Time_sec Total_RAM_Used_MiB Peak_RAM_Used_MiB
1 smallest.sv()             3.64                  0              61.1 
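
peakRAM also accepts several expressions at once and returns one row per call, which makes quick comparisons easy (a sketch; the numbers will differ between machines and runs):

library(peakRAM)
peakRAM(
  smallest.sv(),
  svd(matrix(rnorm(1e6), 1e3))
)
# One row per expression, each with Elapsed_Time_sec, Total_RAM_Used_MiB
# and Peak_RAM_Used_MiB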
Waldi