R: Where are formals for a function stored in memory?

Question

When a function has been defined but has not yet been called, do the formals that do not have default values exist? If they do, do they exist in the execution environment, or in the environment where the function definition is located, or somewhere else?

If a function has been defined but not yet called, and a formal has been assigned a default value, does that value exist? If it does, in what environment does it exist? If the default expression evaluates to a constant, has the formal been assigned to that value, to be overwritten when the function is called if a value is supplied? If not, in what environment is that (fixed) default value located between the moment of definition and the time the function is called?

After the function has been called and actual or default values have been assigned to the formals, passed into the body, and if necessary scoped and/or evaluated, do the formals continue to exist? If so, in what environment do they then exist?

To me it seems clear what the question is asking. In essence: if you have something like `fun <- function(x=2)` does the `x` exist in some kind of environment? And the question is about variations of that sort. — Karolis Koncevičius, Apr 29 '18 at 12:29
@HongOoi - Andrew's question makes sense in the context of John Chambers' quote: "To understand computations in R, two slogans are helpful: 1) Everything that exists is an object, and 2) Everything that happens is a function call." If everything in R is an object, the formals of a function must also be objects. Therefore, it's reasonable to ask questions about the environment(s) in which they exist. — Len Greski, Apr 29 '18 at 17:59

Len Greski · Answer 1 · 2018-05-02T11:13:45.360

4

The formals of a function exist as objects in the environment that R creates when the function is called. In Advanced R, Hadley Wickham calls this environment the execution environment. Before the function is called, the argument names and their default expressions are stored in the function object itself, where formals() can retrieve them. Once a call is in progress, the memory locations of the objects can be inspected with pryr::address().
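A minimal base R sketch of that distinction, assuming nothing beyond a fresh session. The names `fun`, `first` and `second` are hypothetical and chosen only for illustration. Before any call, the default lives in the function object as an unevaluated expression; each call then binds `x` in its own new execution environment.

```r
# Hypothetical function whose default is an expression, not a constant.
fun <- function(x = 1 + 1) {
  env <- environment()   # the execution environment of this particular call
  list(bindings = ls(env), value = x, env = env)
}

# Before any call, the default is stored unevaluated in the function object.
is.call(formals(fun)$x)            # TRUE: the expression 1 + 1, not the value 2

first  <- fun()                    # uses the default
second <- fun(10)                  # uses the supplied value

first$bindings                     # "env" "x"
first$value                        # 2
second$value                       # 10
identical(first$env, second$env)   # FALSE: every call gets a fresh environment
```

With a literal default such as `function(x = 2)`, `formals()` returns the constant itself, but it is still stored in the function object rather than bound to `x` in any environment.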

As an example I'll use a modified version of code that I previously wrote to illustrate memory locations in the makeVector() function from the second programming assignment for the Johns Hopkins R Programming course on coursera.org.

makeVector <- function(x = 200) {
     library(pryr)
     message(paste("Address of x argument is:",address(x)))
     message(paste("Number of references to x is:",refs(x)))
     m <- NULL
     set <- function(y) {
          x <<- y
          message(paste("set() address of x is:",address(x)))
          message(paste("Number of references to x is:",refs(x)))
          m <<- NULL
     }
     get <- function() x
     setmean <- function(mean) m <<- mean
     getmean <- function() m
     list(set = set, get = get,
          setmean = setmean,
          getmean = getmean)
}

As coded above, makeVector() returns a list of closures that share the execution environment of the call. This means we can access objects within that environment via getters and setters, also known as mutator methods.

We can call makeVector() and query the address and value of x with the following code.

makeVector()$get()

...and the result:

> makeVector()$get()
Address of x argument is: 0x1103df4e0
Number of references to x is: 0
[1] 200
>

As we can see from the output, x does have a memory location, but no other objects hold references to it. Also, x took its default value: a numeric vector of length 1 containing 200.
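When the default is actually evaluated can be checked with a short base R sketch. The function `f` and the variable `y` are hypothetical names used only for this illustration. A default expression is evaluated lazily, the first time the argument is used, and inside the execution environment. A supplied argument is evaluated in the environment the call came from.

```r
# Hypothetical function: the default refers to y, which does not exist
# inside the function until the body has started running.
f <- function(x = y * 2) {
  y <- 10   # local variable, created after the call has begun
  x         # the default is evaluated only here, so it sees the local y
}

y <- 1      # a different y in the calling environment

f()         # 20: the default is evaluated in the execution environment
f(y * 2)    # 2: the supplied argument is evaluated in the calling environment
```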

I provide a detailed walkthrough of the objects in the makeVector() environment in my answer to Caching the Mean of a Vector in R.

Regarding the question about how long the formals exist in memory, they exist as long as the execution environment created for the call remains in memory. The garbage collector reclaims objects that are no longer referenced. If nothing returned from the call (such as a closure saved to an object) retains a reference to that environment, it is eligible for garbage collection as soon as the function call returns a result to the calling environment.
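A small base R sketch of the case where the environment survives the call. `make_holder`, `h` and `env` are hypothetical names, and the function is a stripped-down stand-in for makeVector(). Because the returned closure keeps a reference to the execution environment, the formal `x` remains reachable after the call has returned.

```r
# Hypothetical stand-in for makeVector(): returns one closure that captures x.
make_holder <- function(x = 200) {
  list(get = function() x)
}

h <- make_holder()            # keep the result, so the environment stays referenced

env <- environment(h$get)     # the execution environment of that call
ls(env)                       # "x": the formal still exists there
h$get()                       # 200
```

If the result of `make_holder()` is not saved, nothing refers to that environment any more and the garbage collector is free to reclaim it.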

edited May 02 '18 at 11:13

answered Apr 29 '18 at 14:29

Len Greski


This answer is very clear and delightfully edifying. However, I am still uncertain about a few details of it. (1) Is "the environment of a function once it is loaded into memory" the same environment as the execution environment? Or the calling or enclosing environment? Or something else? I have been imagining that the formals and defaults are the function's interface to the outside world and sit in their own "in-between" environment, but I am sure this is wrong. Almost sure. (2) Is the address of x the same as for makeVector? And (continued) – andrewH Apr 29 '18 at 18:32
(3) Does the assignment of 200 to 0x1103df4e0 occur when the function is defined, when it is called, or some time in between? – andrewH Apr 29 '18 at 18:34
@andrewH - when a function is called, the R Evaluator creates a new environment for the function, and initializes "promise objects" for each argument listed in the formals section of the function. These objects are stored within the newly created environment for the function, which is a child of the environment in which the function was created. When arguments need to be evaluated, they will be evaluated in the environment from which the call came. Any arguments not assigned values in this process are set to their default values (Software for Data Analysis, Kindle Edition, location 809+). – Len Greski Apr 29 '18 at 19:02
@andrewH - (2) the address (i.e. location in physical memory) of `x` is different than the address of `makeVector()`. One can check this by saving the result of `makeVector()` to an object, executing `pryr::address()` on it, and comparing that address to the address returned by `$get()`. (3) as stated above, the assignment of 200 occurs when the function is called. If I've answered your question, please accept the answer. – Len Greski Apr 29 '18 at 19:08
Re your first reply: so this is the environment they call the execution environment, yes? Sorry if I am beating this into the ground. I just want to be sure I can connect what I am learning from you with what I think I have learned from Hadley's "Advanced R". Re your second reply: Will do. I read your Caching the Mean post. Very nice piece. – andrewH Apr 29 '18 at 23:13
@andrewH - Yes, this is the environment that Wickham calls the execution environment in the [Environments chapter](http://adv-r.had.co.nz/Environments.html) of *Advanced R*. Thanks for letting me know that you're reading it so I could get the right cross reference for you. Also, no apology needed. Please ask as many follow up questions as you need to fully comprehend the concepts. Also, thanks for the feedback about my original article. – Len Greski Apr 29 '18 at 23:29
@LenGreski, stylistic question: the use of `require` at the beginning will not do much: if it returns `FALSE` your function will continue unabated (and fail with the first `object not found`). Is there rationale behind (1) not checking the return value, or (2) not using `library`? Or is the purpose more explicit-intent and self-documenting code? – r2evans Apr 30 '18 at 16:08
@r2evans - My intent was (2), but I'll update the code to include error checking `require()` and a `library()` call. – Len Greski May 01 '18 at 01:36
I don't understand how your new code is any better than just `library(pryr)`. If `require(pryr)` returns `F`, your call to `library` will certainly fail and stop; if it returns `T`, then `library` can never be called, but the end result is the same: the namespace is loaded and attached to the search path, etc. – r2evans May 01 '18 at 01:44
@r2evans - had a bit of a brain cramp there, the function should `stop()` if `require()` fails. – Len Greski May 01 '18 at 02:04
@r2evans - after reading Yihui Xie's [blog post](https://yihui.name/en/2014/07/library-vs-require/) on `library()` vs. `require()`, I agree that `library()` is more appropriate than `require()`, even though the R documentation states that `require()` is intended to be used within R functions. Thanks for your persistence in asking me about this. – Len Greski May 02 '18 at 11:17
Not that I spend hours pontificating over this topic, but ... I think the only practical use of `require` is when you have two code paths: one when a package is available, another otherwise, perhaps due to speed/efficiency, perhaps added capabilities. With `library`, the second path is hard-coded as `stop`. (And `if (!require(...)) install.packages()` should generally not be used in production, perhaps only in tutorials/demos.) I've cited that post before, btw, it's well-written. – r2evans May 02 '18 at 15:08
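The two-code-path use of `require()` described in the comments above might look like the following sketch. The helper `describe_address` is hypothetical, and the fallback message is an assumption made for illustration. Note that `require()` attaches the package to the search path when it succeeds.

```r
# Hypothetical helper: report an object's address when pryr is available,
# and take a fallback path when it is not.
describe_address <- function(obj) {
  if (require("pryr", quietly = TRUE, character.only = TRUE)) {
    # First code path: pryr is installed and now attached.
    paste("address:", pryr::address(obj))
  } else {
    # Second code path: continue without pryr instead of stopping.
    "address: unavailable (pryr is not installed)"
  }
}

describe_address(200)
```

With `library(pryr)` in place of the `if`, the second path would be an error, which is the behavior the answer's code relies on.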
