How does lapply extract sub-elements from a list? More specifically, how does lapply extract sub-elements from a list of lists versus a list of vectors? Even more specifically, suppose I have the following:

my_list_of_lists <- list(list(a = 1, b = 2), list(a = 2, c = 3), list(b = 4, c = 5))
my_list_of_lists[[1]][["a"]] # just checking
# [1] 1 
# that's what I expected

and apply the following:

lapply(my_list_of_lists, function(x) x[["a"]]) 
# [[1]]
# [1] 1
# 
# [[2]]
# [1] 2
# 
# [[3]]
# NULL

So lapply extracts the a element from each of the 3 sublists and returns the results as the elements of a length-3 list. At this point, my mental model is the following: lapply applies FUN to each element of my_list_of_lists, returning FUN(my_list_of_lists[[i]]) for i in 1:3. Great! So I expect my mental model to work for lists of vectors as well. For example,

my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(b = 4, c = 5))
my_list_of_vecs[[1]][["a"]] # Just checking
# [1] 1
# that's what I expected

and apply the following:

lapply(my_list_of_vecs, function(x) x[["a"]]) 
# Error in x[["a"]] : subscript out of bounds
# Wait...What!?

What's going on here!? Shouldn't this just work? I found a section in help(lapply) which might be relevant:

For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g., bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[i]], ...), with i replaced by the current (integer or double) index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required to ensure that method dispatch for is.numeric occurs correctly.

I really don't know how to make sense of this.

I think it's related to the fact that you can use either [[ or [ to extract a single element from a vector, but you can ONLY use [ to extract a range of elements. For example,

my_list_of_vecs[[1]][1:2]
# a b 
# 1 2
my_list_of_vecs[[1]][[1:2]]
# Error in my_list_of_vecs[[1]][[1:2]] : 
#   attempt to select more than one element in vectorIndex

So under the hood, lapply must be using function(x) x[["a"]] over a range. Is that right?

Debugging doesn't help me here since these functions rely on .Internal functions.
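
Since lapply itself can't be stepped through, one way to check the mental model is to write it out as an explicit loop and record what happens at each index. This is only a sketch: `probe` is a hypothetical helper name, not part of base R, and it is not how lapply is implemented internally.

```r
my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(b = 4, c = 5))

# Hypothetical helper: call FUN on X[[i]] one element at a time and
# store either the returned value or the error message for that index.
probe <- function(X, FUN) {
  out <- vector("list", length(X))
  for (i in seq_along(X)) {
    # out[i] <- list(...) keeps the slot even if FUN returns NULL
    out[i] <- list(tryCatch(FUN(X[[i]]),
                            error = function(e) conditionMessage(e)))
  }
  out
}

res <- probe(my_list_of_vecs, function(x) x[["a"]])
str(res)  # elements 1 and 2 hold values; element 3 holds an error message
```

If the loop fails only at an index whose vector has no a element, the error comes from the single call x[["a"]] on that element rather than from anything applied over a range.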

lowndrul
  • This is not a `lapply` issue. It's an issue with the way `[[` works when you pass it an element name that doesn't exist. Try `my_list_of_lists[[3]][["a"]]` – iod Nov 05 '18 at 01:22
  • Yes. That's the problem. If you had `my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(a = 4, c = 5))` for instance, `lapply(my_list_of_vecs, function(x) x[["a"]])` would return a list as it did with the list of lists. – prosoitos Nov 05 '18 at 01:30
  • @iod I assume you mean `my_list_of_vecs[[3]][["a"]]`. Yup. I see. When asking for non-existent element names, lists return `NULL` and vectors return errors. – lowndrul Nov 05 '18 at 02:51
  • Yep, sorry, my bad. – iod Nov 05 '18 at 04:05
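
The difference described in the comments can be checked without lapply at all. A minimal sketch, assuming that a missing a should become NA; `get_a` is a hypothetical helper name, and NA_real_ is just one possible choice of fill value.

```r
my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(b = 4, c = 5))

lst <- list(b = 4, c = 5)  # a list with no element named "a"
vec <- c(b = 4, c = 5)     # an atomic vector with no element named "a"

lst[["a"]]  # a list gives NULL for a missing name
vec["a"]    # single-bracket indexing on a vector gives NA instead of an error
# vec[["a"]] is the call that raises "subscript out of bounds"

# Hypothetical helper: check for the name before extracting with [[
get_a <- function(x) if ("a" %in% names(x)) x[["a"]] else NA_real_

res <- lapply(my_list_of_vecs, get_a)
```

Checking names(x) first keeps the strictness of [[ for elements that do exist while making the missing case explicit.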
