Consider:

x <- 5
replicate(10, x <- x + 1)

This has output c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6). However:

x <- 5
replicate(10, x <<- x + 1)

has output c(6, 7, 8, 9, 10, 11, 12, 13, 14, 15).

What does this imply about the environment that x <- x + 1 is evaluated in? Am I to believe that x is treated as if it is an internal variable for replicate? That appears to be what I'm seeing, but when I consulted the relevant section of the language definition, I saw the following:

It is also worth noting that the effect of foo(x <- y) if the argument is evaluated is to change the value of x in the calling environment and not in the evaluation environment of foo.

But if x really was changed in the calling environment, then why does:

x <- 5
replicate(10, x <- x + 1)
x

return 5 and not 15? What have I misunderstood?

J. Mini

1 Answer

The sentence you quoted from the language definition is about standard evaluation, but replicate uses non-standard evaluation. Here's its source:

replicate <- function (n, expr, simplify = "array") 
sapply(integer(n), eval.parent(substitute(function(...) expr)), 
    simplify = simplify)

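The substitute(function(...) expr) call takes your expression x <- x + 1 without evaluating it and builds the unevaluated definition of a new function, which eval.parent() then evaluates in the frame that called replicate(), giving: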

function(...) x <- x + 1

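That's the function that gets passed to sapply(), which applies it to a vector of length n. Each call gets its own fresh frame, so x <- x + 1 reads x from the calling environment (still 5), assigns 6 to a new local x, and that local disappears when the call returns. That's why you see ten 6s and the outer x is still 5 afterwards.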
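To see this without replicate(), you can build the equivalent function by hand. This is only a sketch of the idea: the names f and out are made up for illustration, and f is a stand-in for the function replicate() constructs, not its actual internals.

```r
x <- 5

# Hand-written stand-in for the function built from replicate(10, x <- x + 1).
f <- function(...) x <- x + 1

# Each call to f() runs in a fresh frame: it reads the outer x (5),
# assigns 6 to a local x, and returns that value.
out <- sapply(integer(10), f)

out  # 6 6 6 6 6 6 6 6 6 6
x    # 5, the outer x was never modified
```
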

When you use x <<- x + 1, the evaluation still takes place in the constructed function, but its environment is the calling environment to replicate() (because of the eval.parent call), and that's where the assignment happens. That's why you get the increasing values in the output.
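One way to check that the assignment lands in the caller of replicate(), rather than always in the global environment, is to call replicate() from inside another function. A small sketch, where counter and res are hypothetical names used only for this illustration:

```r
counter <- function() {
  x <- 5  # local to counter(), which is the caller of replicate()
  vals <- replicate(3, x <<- x + 1)
  list(vals = vals, x = x)
}

res <- counter()
res$vals  # 6 7 8
res$x     # 8, the x inside counter() was updated by <<-
```

Here <<- starts its search in the enclosing environment of the constructed function, which is counter()'s frame, finds x there, and updates it on every call.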

So I think you understood the manual correctly, but it didn't make clear it was talking there about the case of standard evaluation. The following paragraph hints at what's happening here:

It is possible to access the actual (not default) expressions used as arguments inside the function. The mechanism is implemented via promises. When a function is being evaluated the actual expression used as an argument is stored in the promise together with a pointer to the environment the function was called from. When (if) the argument is evaluated the stored expression is evaluated in the environment that the function was called from. Since only a pointer to the environment is used any changes made to that environment will be in effect during this evaluation. The resulting value is then also stored in a separate spot in the promise. Subsequent evaluations retrieve this stored value (a second evaluation is not carried out). Access to the unevaluated expression is also available using substitute.

but the help page for replicate() doesn't make clear this is what it's doing.
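For comparison, here is a sketch of the standard-evaluation case that the quoted passages describe. The functions foo and bar are made up for illustration: foo forces its argument twice, and bar only looks at the unevaluated expression.

```r
foo <- function(arg) {
  first  <- arg  # forces the promise: x <- x + 1 runs in the calling environment
  second <- arg  # reuses the cached value; the expression is not run again
  c(first, second)
}

bar <- function(arg) substitute(arg)  # never evaluates the argument

x <- 5
res <- foo(x <- x + 1)
res  # 6 6
x    # 6, changed once in the calling environment

expr <- bar(x <- x + 1)
deparse(expr)  # "x <- x + 1"
x              # still 6, because bar() never evaluated the assignment
```
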

BTW, your title asks about the apply family of functions, but apart from replicate most of them explicitly ask for a function, so this issue doesn't arise there. For example, it's obvious that this doesn't affect the global x:

sapply(integer(10), function(i) x <- x + 1)
user2554330
  • Why am I starting to lose faith in R's documentation? You've said that the docs for `replicate` don't make what it's doing clear, but I can't see those docs mentioning it at all. There's a small bit in the "Note" section, but it doesn't seem relevant. Just to be clear: Do the docs for `replicate` say what is being done but say it badly, or do they not say at all? – J. Mini Feb 22 '21 at 19:33
  • Answering your questions in order: 1. Because you have unrealistic expectations. 2. Neither one. They hint at it by saying `replicate` is a wrapper for `sapply`, but that's it. `sapply` only works on functions, so the hint tells you there's got to be one created somewhere, but you need to look at the source to figure out the details. Luckily that's easy, since `replicate` is pure R code. The source to functions like `lapply` is a lot harder to look at, since it's mostly C code. – user2554330 Feb 23 '21 at 13:55