
I have the following code in R:

named_list = list()
for (i in 1:5){
named_list[[i]] = function(one,two){c(one,two, i)}
}

However, when I call the function:

> named_list[[1]]("first", "second")
[1] "first"  "second" "5"

Is there a way to get this to work properly (to return "first", "second", "1") without using the apply functions? I have tried to use the force function as recommended in another thread, but I cannot get it to work.

Thanks.

Edit: For some clarification, I am looking to make a list of functions, each of which encloses the index of where that function is in that list. In particular, observe that

> named_list[[1]]("first", "second")
[1] "first"  "second" "5"

> named_list[[2]]("first", "second")
[1] "first"  "second" "5"

> named_list[[3]]("first", "second")
[1] "first"  "second" "5"

> named_list[[4]]("first", "second")
[1] "first"  "second" "5"

> named_list[[5]]("first", "second")
[1] "first"  "second" "5"

which is obviously not the desired behaviour. The problem is that looping i through 1 to 5, R sees the first 'i' indexing the named_list, but doesn't see the second 'i' which is inside the function I am trying to define.

I am aware that the following is a possible solution (although I do not know why it works):

named_list = lapply(1:5, function(i) function(one,two)(c(one,two,i)))

but I want to know if there is an alternative solution that uses the for loop.

Josh
  • Why is this "lazy evaluation"? Is there a reason you cannot define `named_list<-function(one,two,i=1)`; or perhaps `nl<-function(i){function(one,two)...}` and use `named_list(1)(one,two)`? I guess I don't understand (1) the intent of your list of functions; (2) your aversion to `*apply`; and (3) your aversion to a more classical function definition. – r2evans Feb 05 '18 at 05:58
  • Please see my edit, which I hope clarifies some of your questions. – Josh Feb 05 '18 at 11:55

2 Answers

I think your problem is related to scoping. Namely, when a function references a variable that has not been defined locally in that function, R starts searching in the function's enclosing environment (the environment in which the function was defined, which I'll loosely call its "parent"); if the variable is not there, R moves on to that environment's parent (the "grand-parent"); etc. (One good read for this is Advanced R: Environments; an extra read might be the same book's chapter on Memory.)
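A minimal sketch of that lookup rule, using made-up names (`x`, `show_x`, `caller`) that are not part of the question: the function finds `x` in the environment where it was defined, not in the environment it happens to be called from.

```r
x <- "outer"

show_x <- function() {
  # x is not defined locally, so R looks in the environment
  # where show_x was defined
  x
}

caller <- function() {
  x <- "caller-local"  # a different x; show_x() never sees it
  show_x()
}

result <- caller()
print(result)
```

Here `result` holds the outer value, because the local `x` inside `caller` is not on the search path of `show_x`.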

It's helpful to look at the environment being used/searched at any given time. I'll focus on the current, parent, and, when inside the function, the "grand-parent" environments; realize, though, that deeply nested functions may have many more (which suggests you need to be very careful when depending on R to hunt down and find the specific instance of a variable not in the local environment!).

NB: you will very likely not get the same <environment: 0x000...> pointers. These references are completely unreproducible and change each time this code is run.


Let's start with the lapply setup that works:

print(environment())
# <environment: R_GlobalEnv>
nl1 <- lapply(1:2, function(i) {
  e1 <- environment()
  str(list(where="inside lapply", env=e1, parent=parent.env(e1)))
  function(one,two) {
    e2 <- environment()
    str(list(where="inside func", env=e2, parent=parent.env(e2),
             grandparent=parent.env(parent.env(e2))))
    c(one, two, i)
  }
})
# List of 3
#  $ where : chr "inside lapply"
#  $ env   :<environment: 0x0000000009128fe0> 
#  $ parent:<environment: R_GlobalEnv> 
# List of 3
#  $ where : chr "inside lapply"
#  $ env   :<environment: 0x00000000090bb578> 
#  $ parent:<environment: R_GlobalEnv> 

First notice that with each iteration within lapply, there is a new environment, starting with 9128fe0, whose parent is the global env. Within the second iteration of the lapply, we are in 90bb578, and within that environment, we define the function(one,two) whose local environment is 8f811b8 (which we see in the next code block).

Realize that at this time, R has not attempted to resolve i. Let's run a function:

nl1[[2]](11,12)
# List of 4
#  $ where      : chr "inside func"
#  $ env        :<environment: 0x0000000008f811b8> 
#  $ parent     :<environment: 0x00000000090bb578> 
#  $ grandparent:<environment: R_GlobalEnv> 
# [1] 11 12  2

So when we reference i, R searches in the following, in order, to find it:

  • 8f811b8: inside function(one,two)..., not found
  • 90bb578: immediate parent env, inside function(i) ...; found
  • R_GlobalEnv (not searched, since it was found previously)
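The same chain can be inspected without calling the function. This is a small sketch, assuming R >= 3.2.0 (where lapply forces the argument on each iteration); `fs` and `encl` are illustrative names, not part of the code above.

```r
# Each lapply iteration creates its own environment holding its own i
fs <- lapply(1:3, function(i) function() i)

# The enclosing environment of the second closure
encl <- environment(fs[[2]])

# i lives directly in that environment (inherits = FALSE means "look only here")
has_i <- exists("i", envir = encl, inherits = FALSE)
i_val <- get("i", envir = encl)

# The first closure has a different enclosing environment
same_env <- identical(environment(fs[[1]]), encl)

print(has_i)
print(i_val)
print(same_env)
```

Each closure carries its own copy of `i`, which is why the lapply version returns a different value per list element.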

Okay, let's try the for loop:

nl2 <- list()
for (i in 1:2) {
  e1 <- environment()
  str(list(where="inside for", env=e1, parent=parent.env(e1)))
  nl2[[i]] <- function(one,two) {
    e2 <- environment()
    str(list(where="inside func", env=e2, parent=parent.env(e2),
             grandparent=parent.env(parent.env(e2))))
    c(one, two, i)
  }
}
# List of 3
#  $ where : chr "inside for"
#  $ env   :<environment: R_GlobalEnv> 
#  $ parent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# List of 3
#  $ where : chr "inside for"
#  $ env   :<environment: R_GlobalEnv> 
#  $ parent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"

First thing to notice is that within each iteration of the for loop, the local environment is R_GlobalEnv, which should make sense. (You can safely ignore the reference to the tcltk environment as the parent.)

Okay, now when we get to the nl2[[1]] call, notice that the parent environment is (perhaps now, not surprisingly) the R_GlobalEnv environment:

nl2[[1]](11,12)
# List of 4
#  $ where      : chr "inside func"
#  $ env        :<environment: 0x000000001b1a6720> 
#  $ parent     :<environment: R_GlobalEnv> 
#  $ grandparent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# [1] 11 12  2

This was the first time that R needed to find i, so it first searched within 1b1a6720 (within function(one,two), where it was not found), and then in the R_GlobalEnv.

So why did it return "2"?

Because the value of i in R_GlobalEnv is, at the time we called nl2[[1]], the last value of i in the for loop. See this:

rm(i)
for (i in 1:100) { } # no-op
i
# [1] 100

What's even more telling is if we try to call the function now:

nl2[[1]](11,12)
# List of 4
#  $ where      : chr "inside func"
#  $ env        :<environment: 0x000000000712c2a0> 
#  $ parent     :<environment: R_GlobalEnv> 
#  $ grandparent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# [1]  11  12 100

So the evaluation of i within that function is lazy in the sense that R looks i up each time you call the function, not when the function is defined.

In your environment (before you change any code), if you typed in i <- 100, you would see similar behavior.
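The shared-environment point can also be checked directly. A small sketch, using a loop variable `k` and a list `fs` that are not part of the code above:

```r
fs <- list()
for (k in 1:3) {
  fs[[k]] <- function() k
}

# All three closures were defined in the same environment,
# so they all look up the same k
shared <- identical(environment(fs[[1]]), environment(fs[[3]]))

before <- fs[[1]]()  # k still holds the loop's last value
k <- 42
after <- fs[[1]]()   # the same function now sees the new k

print(shared)
print(before)
print(after)
```

Nothing about the functions changed between the two calls; only the single shared `k` did.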


If you are absolutely against using lapply (which is my preferred method, even if I don't understand your underlying need here), try explicitly defining the environment that surrounds your function. One way is to use local, which will preserve searching within existing parent environments while allowing us to "force" which i we want used. (Other options exist; I invite others to comment and for you to explore environments more.)

nl3 <- list()
for (i in 1:2) {
  e1 <- environment()
  str(list(where="inside for", env=e1, parent=parent.env(e1)))
  nl3[[i]] <- local({
    i <- i # forces it locally within this env
    function(one,two) {
      e2 <- environment()
      str(list(where="inside func", env=e2, parent=parent.env(e2),
               grandparent=parent.env(parent.env(e2))))
      c(one, two, i)
    }
  })
}
# List of 3
#  $ where : chr "inside for"
#  $ env   :<environment: R_GlobalEnv> 
#  $ parent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
# List of 3
#  $ where : chr "inside for"
#  $ env   :<environment: R_GlobalEnv> 
#  $ parent:<environment: package:tcltk> 
#   ..- attr(*, "name")= chr "package:tcltk"
#   ..- attr(*, "path")= chr "c:/R/R-3.3.3/library/tcltk"
nl3[[1]](11,12)
# List of 4
#  $ where      : chr "inside func"
#  $ env        :<environment: 0x0000000019ca23e0> 
#  $ parent     :<environment: 0x000000001aabe388> 
#  $ grandparent:<environment: R_GlobalEnv> 
# [1] 11 12  1
i <- 1000
nl3[[1]](11,12)
# List of 4
#  $ where      : chr "inside func"
#  $ env        :<environment: 0x0000000008d0bc78> 
#  $ parent     :<environment: 0x000000001aabe388> 
#  $ grandparent:<environment: R_GlobalEnv> 
# [1] 11 12  1

(You may notice that the local environment when you call the function changes each time while the parent does not. This is because when you call a function, it starts at the beginning of the function's call with a new environment. You "know" and rely on this because you assume that at the beginning of your function, no variables are defined. This is normal.)
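That per-call behaviour is easy to confirm in isolation. A minimal sketch with a throwaway function `f` (not part of the code above) that just returns its own execution environment:

```r
f <- function() environment()

e1 <- f()
e2 <- f()

# Each call gets a fresh execution environment ...
fresh <- !identical(e1, e2)

# ... but both share the same enclosing (parent) environment
same_parent <- identical(parent.env(e1), parent.env(e2))

print(fresh)
print(same_parent)
```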

r2evans
  • Hey, not the original poster, but wanted to say great explanation! Could you elaborate on how the `local` function actually works? Does it act such that any script inside the `local` function will only look for variables inside those brackets? If so, how is it that `i <- i` will take the parent `i` and place it inside the local `i`? – LachlanO Feb 06 '18 at 00:52
  • It doesn't block scope search (try `local({ parent.env(environment()) })` to see that it reaches out), but it's a way to place specific *things* (variables, functions) into a manually-defined environment. The use of `i<-...` ensures there is a variable named "i" in the `local` environment; the fact that I set it to the value of `i` in the enclosing environment is coincidental. Now, technically, there are two such `i` objects, one in the `local` env, one in global, but the functions should always find the former one first. – r2evans Feb 06 '18 at 01:12
  • Ah, this makes sense. A LOT of sense actually. Thank you very much :) – LachlanO Feb 06 '18 at 01:48

Whenever I get into situations like this I decide to just write it out as text and wrap it inside an eval statement. Like so.

named_list = list()
for (i in 1:5){

  eval(parse(text = paste0("named_list[[i]] = function(one,two){c(one,two,", i, ")}")))

}

named_list[[1]]("first", "second")

Now I get

> named_list[[1]]("first", "second")
[1] "first"  "second" "1" 

As desired.

So all I did was write the code I wanted as a text string, with the current value of i pasted into it, and have R evaluate that string instead.

There's probably a better solution, but that will do the job for you.
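One possible alternative that keeps the for loop and avoids parsing text is a small factory function combined with force. This is only a sketch; `make_fn` is a hypothetical helper name, not something from the question.

```r
# Factory: each call creates a new environment with its own i
make_fn <- function(i) {
  force(i)  # evaluate i now, rather than when the inner function first runs
  function(one, two) c(one, two, i)
}

named_list <- list()
for (i in 1:5) {
  named_list[[i]] <- make_fn(i)
}

out1 <- named_list[[1]]("first", "second")
out5 <- named_list[[5]]("first", "second")
print(out1)
print(out5)
```

Without the `force(i)` line, the argument would stay an unevaluated promise until the inner function is first called, by which time the loop variable has moved on.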

LachlanO
  • https://stackoverflow.com/questions/13649979/what-specifically-are-the-dangers-of-evalparse – Ronak Shah Feb 05 '18 at 05:09
  • I agree! It depends on context though; sometimes I use R a lot for experimentation, and this is often faster than working out the 'other' way to do things. If my code ever becomes something where speed is needed, that's when I jump out of the eval(parse()) world. I think it's a good get-the-job-done tool, but as I say, there's probably a better solution :) – LachlanO Feb 05 '18 at 05:12