I think your problem is related to scope or namespace. Namely, when a function references a variable that has not been defined locally in that function, R starts searching in the parent environment (the environment where the function was defined); if it is not there, it goes to the parent's parent (the grand-parent?); and so on. (One good read for this is Advanced R: Environments; an extra read might be the same book's chapter on Memory.)
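To make that lookup rule concrete, here is a minimal sketch (all names are hypothetical and chosen only for illustration) suggesting that the search follows where a function was defined, not where it happens to be called from:

```r
# Hypothetical names, for illustration only.
x <- "global"

make_reader <- function() {
  x <- "enclosing"   # lives in the environment created by this call
  function() x       # x is not local here, so R looks in the parent env
}

read_x <- make_reader()

caller <- function() {
  x <- "caller"      # a local x in the calling function
  read_x()           # the lookup still follows where read_x was defined
}

result <- caller()
result
# [1] "enclosing"
```

Neither the x in caller() nor the global x is used, because the search stops at the first environment in the chain that has an x.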
It's helpful to look at the environment being used/searched at any given time. I'll focus on the current, parent, and when inside the function, the "grand-parent" environments; realize, though, that deeply nested functions may have many more (which suggests you need to be very careful when depending on R to hunt down the specific instance of a variable that is not in the local environment!).
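If you want to see the whole chain rather than just two or three levels, something like the following sketch could help. env_chain() is a hypothetical helper of my own, not a base R function; it only uses parent.env(), environmentName(), and emptyenv().

```r
# env_chain() is a hypothetical helper, not part of base R.
# It walks from `env` up to the empty environment and collects the names.
# Anonymous environments (e.g. those created by function calls) show up as "".
env_chain <- function(env = parent.frame()) {
  out <- character(0)
  repeat {
    out <- c(out, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  out
}

chain <- env_chain(globalenv())
head(chain, 1)
# [1] "R_GlobalEnv"
tail(chain, 2)
# [1] "base"       "R_EmptyEnv"
```

Everything between the first and last two entries is whatever packages are attached in your session, so that part will differ from machine to machine.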
NB: you will very likely not get the same <environment: 0x000...> pointers. These references are completely unreproducible and change each time this code is run.
Let's start with the lapply setup that works:
print(environment())
# <environment: R_GlobalEnv>
nl1 <- lapply(1:2, function(i) {
  e1 <- environment()
  str(list(where="inside lapply", env=e1, parent=parent.env(e1)))
  function(one,two) {
    e2 <- environment()
    str(list(where="inside func", env=e2, parent=parent.env(e2),
             grandparent=parent.env(parent.env(e2))))
    c(one, two, i)
  }
})
# List of 3
# $ where : chr "inside lapply"
# $ env :<environment: 0x0000000009128fe0>
# $ parent:<environment: R_GlobalEnv>
# List of 3
# $ where : chr "inside lapply"
# $ env :<environment: 0x00000000090bb578>
# $ parent:<environment: R_GlobalEnv>
First notice that with each iteration within lapply, there is a new environment, starting with 9128fe0, whose parent is the global env. Within the second iteration of the lapply, we are in 90bb578, and within that environment, we define the function(one,two) whose local environment is 8f811b8 (which we see in the next code block).
Realize that at this time, R has not attempted to resolve i. Let's run a function:
nl1[[2]](11,12)
# List of 4
# $ where : chr "inside func"
# $ env :<environment: 0x0000000008f811b8>
# $ parent :<environment: 0x00000000090bb578>
# $ grandparent:<environment: R_GlobalEnv>
# [1] 11 12 2
So when we reference i, R searches in the following, in order, to find it:
1. 8f811b8: inside function(one,two)..., not found
2. 90bb578: immediate parent env, inside function(i) ...; found
3. R_GlobalEnv (not searched, since it was found previously)
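The search order above can be sketched in a few lines of base R. find_env() is a hypothetical helper written for this answer (not a base function); it starts from a function's enclosing environment, since the local environment only exists while the function is running.

```r
# find_env() is a hypothetical helper: return the first environment in the
# chain (starting at `env`) that defines `name`, or NULL if none does.
find_env <- function(name, env) {
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) return(env)
    env <- parent.env(env)
  }
  NULL
}

# Same shape as nl1, without the str() diagnostics.
fs <- lapply(1:2, function(i) function(one, two) c(one, two, i))

holder <- find_env("i", environment(fs[[2]]))
identical(holder, environment(fs[[2]]))
# [1] TRUE
get("i", envir = holder)
# [1] 2
```

So the i that the second function sees lives in the environment of the second lapply iteration, and the global environment never needs to be consulted.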
Okay, let's try the for loop:
nl2 <- list()
for (i in 1:2) {
  e1 <- environment()
  str(list(where="inside for", env=e1, parent=parent.env(e1)))
  nl2[[i]] <- function(one,two) {
    e2 <- environment()
    str(list(where="inside func", env=e2, parent=parent.env(e2),
             grandparent=parent.env(parent.env(e2))))
    c(one, two, i)
  }
}
# List of 3
# $ where : chr "inside for"
# $ env :<environment: R_GlobalEnv>
# $ parent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# List of 3
# $ where : chr "inside for"
# $ env :<environment: R_GlobalEnv>
# $ parent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
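First thing to notice is that within each iteration of the for loop, the local environment is R_GlobalEnv, which should make sense: a for loop does not create an environment of its own. (You can safely ignore the reference to the tcltk environment as the parent; it is just the most recently attached package on my search path.)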
Okay, now when we get to the nl2[[1]] call, notice that the parent environment is (perhaps now, not surprisingly) the R_GlobalEnv environment:
nl2[[1]](11,12)
# List of 4
# $ where : chr "inside func"
# $ env :<environment: 0x000000001b1a6720>
# $ parent :<environment: R_GlobalEnv>
# $ grandparent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# [1] 11 12 2
This was the first time that R needed to find i, so it first searched within 1b1a6720 (within function(one,two), where it was not found), and then in the R_GlobalEnv.
So why did it return "2"?
Because the value of i in R_GlobalEnv is, at the time we called nl2[[1]], the last value of i in the for loop. See this:
rm(i)
for (i in 1:100) { } # no-op
i
# [1] 100
What's even more telling is if we try to call the function now:
nl2[[1]](11,12)
# List of 4
# $ where : chr "inside func"
# $ env :<environment: 0x000000000712c2a0>
# $ parent :<environment: R_GlobalEnv>
# $ grandparent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# [1] 11 12 100
So the evaluation of i within that function is lazy in that it searches when you call the function.
In your environment (before you change any code), if you typed in i <- 100, you would see similar behavior.
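Stripped of the diagnostics, the same effect can be reproduced in a few lines. This is only a sketch; the names fs and k are made up for the example.

```r
fs <- list()
for (k in 1:3) {
  fs[[k]] <- function() k   # k is not copied here; it is looked up at call time
}

before <- sapply(fs, function(f) f())
before
# [1] 3 3 3

k <- 100                    # change the variable the functions will find
after <- sapply(fs, function(f) f())
after
# [1] 100 100 100
```

All three functions share the one k in the environment where the loop ran, so they always agree with each other and with whatever k currently holds.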
If you are absolutely against using lapply (which is my preferred method here, even if I don't understand your underlying need), try explicitly defining the environment that surrounds your function. One way is to use local, which will preserve searching within existing parent environments while allowing us to "force" which i we want used. (Other options exist; I invite others to comment and for you to explore environments more.)
nl3 <- list()
for (i in 1:2) {
  e1 <- environment()
  str(list(where="inside for", env=e1, parent=parent.env(e1)))
  nl3[[i]] <- local({
    i <- i # forces it locally within this env
    function(one,two) {
      e2 <- environment()
      str(list(where="inside func", env=e2, parent=parent.env(e2),
               grandparent=parent.env(parent.env(e2))))
      c(one, two, i)
    }
  })
}
# List of 3
# $ where : chr "inside for"
# $ env :<environment: R_GlobalEnv>
# $ parent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# List of 3
# $ where : chr "inside for"
# $ env :<environment: R_GlobalEnv>
# $ parent:<environment: package:tcltk>
# ..- attr(*, "name")= chr "package:tcltk"
# ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
nl3[[1]](11,12)
# List of 4
# $ where : chr "inside func"
# $ env :<environment: 0x0000000019ca23e0>
# $ parent :<environment: 0x000000001aabe388>
# $ grandparent:<environment: R_GlobalEnv>
# [1] 11 12 1
i <- 1000
nl3[[1]](11,12)
# List of 4
# $ where : chr "inside func"
# $ env :<environment: 0x0000000008d0bc78>
# $ parent :<environment: 0x000000001aabe388>
# $ grandparent:<environment: R_GlobalEnv>
# [1] 11 12 1
(You may notice that the local environment changes each time you call the function while the parent does not. This is because every call to a function starts with a fresh environment. You "know" and rely on this because you assume that at the beginning of your function, no variables are defined. This is normal.)
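As one of those other options, a function factory is a possible alternative to local. The sketch below is my own illustration, and make_fun and nl4 are hypothetical names. It relies on force(), which is a base R function, to evaluate the argument at the moment the factory is called.

```r
# make_fun() is a hypothetical factory: each call creates a new environment
# that holds its own copy of i.
make_fun <- function(i) {
  force(i)   # evaluate i now; without this, i would be resolved on first use
  function(one, two) c(one, two, i)
}

nl4 <- list()
for (i in 1:2) {
  nl4[[i]] <- make_fun(i)
}

i <- 1000    # changing the global i afterwards has no effect
r1 <- nl4[[1]](11, 12)   # 11 12 1
r2 <- nl4[[2]](11, 12)   # 11 12 2
```

The force() call matters here: function arguments are evaluated lazily, so without it the functions would not look up i until their first call, and by then the loop (and the i <- 1000 line) would already have changed it.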