
I sometimes work with many large objects, and it would be nice to have a fresh start between chunks because of memory issues. Consider the following example:

Warning: I have 8 GB of RAM. If you don't have that much, this example might eat it all up.

<<chunk1>>=
a <- 1:200000000
@
<<chunk2>>=
b <- 1:200000000
@
<<chunk3>>=
c <- 1:200000000
@

The obvious solution in this case is to remove each object and call gc() before creating the next one:

<<chunk1>>=
a <- 1:200000000
@
<<chunk2>>=
rm(a)
gc()
b <- 1:200000000
@
<<chunk3>>=
rm(b)
gc()
c <- 1:200000000
@

However, in my real example (which I cannot post because it relies on a large dataset), even after I remove all of the objects and run gc(), R does not release all of the memory (only some of it). The reason is found in ?gc:

However, it can be useful to call ‘gc’ after a large object has
been removed, as this may prompt R to return memory to the
operating system.

Note the important word "may". The R documentation hedges like this in many situations, so this behavior is not a bug.

Is there a chunk option according to which I can have knitr start a new R session?

Xu Wang
  • Perhaps if you called (from the chunk) `Rscript` with the code to save these objects in a way that could be lazy loaded? – mnel Oct 09 '12 at 03:07
  • if you start a new R session for each chunk, the objects created in one chunk will not be available to any other chunks, so what is the point of putting all these objects in the same document? BTW, `cache=TRUE` will save objects and lazy load them later, which will not consume your memory unless you really use these objects later (explained in the manual: https://github.com/downloads/yihui/knitr/knitr-manual.pdf). Have you tried this option? (see the sketch after these comments) – Yihui Xie Oct 09 '12 at 05:00
  • @Yihui yes, I have tried that. The problem is not repeated compiles (which caching would help with). The problem is with just compiling once. To answer your question about what's the point, I think it's understandable to want to modularize your code and your writing. Here is a programming analogy response to your question: Why break your code up modularly into different functions or different .c files when you could just put everything into `main`? – Xu Wang Oct 09 '12 at 07:05
  • @XuWang I understand modularization, but I do not understand your analogy. I mean your "modules" would be completely independent of each other if you start them in new R sessions, i.e. you will not be able to actually _use_ any "modules" later in this document, then why put them in this document at all? – Yihui Xie Oct 09 '12 at 16:18
  • @Yihui I see your point. I will rethink my code. Thank you for your help and for your work on knitr! – Xu Wang Oct 09 '12 at 22:42
  • The [subprocess](https://cran.r-project.org/package=subprocess) package will do what you want; for an example, see the package vignette: https://github.com/lbartnik/subprocess – Peter May 11 '18 at 19:13
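
For reference, a minimal sketch of the `cache=TRUE` option mentioned in the comments (the chunk name is illustrative):

<<chunk1, cache=TRUE>>=
a <- 1:200000000
@

On the first compile the chunk runs and its objects are saved to disk; on later compiles they are lazy-loaded, so they do not occupy memory unless a subsequent chunk actually uses them.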

1 Answer


My recommendation would be to create an individual .Rnw file for each of the major tasks, knit each to a .tex file, and then use \include or \input in a parent .Rnw file to build the full project. Control the building of the project via a makefile.
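
As a sketch of that layout (the file names here are hypothetical), each task lives in its own .Rnw file, and the parent document pulls in the pre-knitted .tex output:

\documentclass{article}
\begin{document}
\input{task1} % produced by knitting task1.Rnw in its own R session
\input{task2} % produced by knitting task2.Rnw in its own R session
\end{document}

Because the makefile knits each task in a separate R process before building the parent, every task gets a fresh session, and its memory is returned to the operating system when that process exits.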

However, to address the specific question of using a fresh R session for each chunk, you could use the R package subprocess to spawn an R session, run the needed code, extract the results, and then kill the spawned session.

A simple example .Rnw file:

\documentclass{article}
\usepackage{fullpage}
\begin{document}

<<include = FALSE>>=
knitr::opts_chunk$set(collapse = FALSE)
@

<<>>=
library(subprocess)

# define a function to identify the R binary
R_binary <- function() {
  R_exe <- ifelse(tolower(.Platform$OS.type) == "windows", "R.exe", "R")
  return(file.path(R.home("bin"), R_exe))
}
@


<<>>=
# Start a subprocess running vanilla R.
subR <- subprocess::spawn_process(R_binary(), c("--vanilla", "--quiet"))
Sys.sleep(2) # wait for the process to spawn

# write to the process
subprocess::process_write(subR, "y <- rnorm(100, mean = 2)\n")
subprocess::process_write(subR,  "summary(y)\n")

# read from the process
subprocess::process_read(subR, PIPE_STDOUT)

# kill the process before moving on.
subprocess::process_kill(subR)
@


<<>>=
print(sessionInfo(), locale = FALSE)
@

\end{document}

This generates the following PDF:

[screenshots of the rendered PDF]

Peter