Suppose we have a large nested list of arbitrary size (the nesting depth may exceed 100). Somewhere in it, at locations not known in advance, are up to several thousand objects that we need to read and modify often. We therefore keep a separate variable holding pointers to these objects. What is the fastest way to create such pointers?
So far, I can think of 4 different solutions, shown in the code below.
First, we need a dummy nested list for the sake of demonstration:
create_nested_list <- function(depth) {
  myList <- list()
  if (depth > 0) {
    depth <- depth - 1
    myList$level <- paste0('depth_', depth)
    myList[[paste0('Depth_', depth, '_A')]] <- create_nested_list(depth)
    myList[[paste0('Depth_', depth, '_B')]] <- create_nested_list(depth)
  }
  return(myList)
}
myList <- create_nested_list(10)
Now suppose we want to read and frequently modify the following element of the list:
myList$Depth_9_B$Depth_8_A$Depth_7_A$Depth_6_A$Depth_5_A$Depth_4_A$Depth_3_A$Depth_2_A$level
The expression above accesses the element directly. However, it does not work for our case: assigning its result to a variable creates a copy of the object instead of a pointer.
The base R solution is to save the path to the object as a string and evaluate the resulting expression:
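A minimal sketch of the copy behaviour described above, using a small stand-in list (demo and its element names are hypothetical, not part of the benchmark):

```r
# Small stand-in for the nested list; names are hypothetical.
demo <- list(a = list(b = list(level = "depth_0")))

# Extracting the element binds its value to a new name.
lvl <- demo$a$b$level
lvl <- "changed"                      # modifies only the local variable

still_original <- demo$a$b$level      # the list itself is untouched

# Only writing through the full path modifies the list.
demo$a$b$level <- "changed"
now_changed <- demo$a$b$level
```

So a variable holding the extracted value cannot serve as a pointer; every modification has to go through the full path again.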
path <- '$Depth_9_B$Depth_8_A$Depth_7_A$Depth_6_A$Depth_5_A$Depth_4_A$Depth_3_A$Depth_2_A$level'
eval(str2lang(paste0('myList', path)))
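The line above only reads the element. A sketch of how a write through the same kind of stored path might look, again on a small hypothetical list (demo, demo_path and path_vec are illustrative names). The second half shows a different technique, not one of the four benchmarked here: base R's recursive [[ ]] indexing with a character vector, which avoids parsing a string.

```r
# Small stand-in list; the names are hypothetical.
demo <- list(a = list(b = list(level = "depth_0")))
demo_path <- "$a$b$level"

# Read through the stored path.
value_before <- eval(str2lang(paste0("demo", demo_path)))

# Write through the stored path by building the call
# demo$a$b$level <- "new" and evaluating it.
target <- str2lang(paste0("demo", demo_path))
eval(call("<-", target, "new"))
value_after <- demo$a$b$level

# Alternative (not benchmarked in this post): keep the path as a
# character vector and use recursive [[ ]] indexing.
path_vec <- c("a", "b", "level")
demo[[path_vec]] <- "newer"
value_vec <- demo[[path_vec]]
```

Whether the character-vector form is faster than eval on a deep list would need to be measured; it is shown here only as a possible variant of the stored-path idea.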
We can also use the "pointr" package to create the pointer object:
library(pointr)
ptr('pointer_to_the_object', 'myList$Depth_9_B$Depth_8_A$Depth_7_A$Depth_6_A$Depth_5_A$Depth_4_A$Depth_3_A$Depth_2_A$level')
pointer_to_the_object
Instead of using the S3 object, we can use R6/Reference classes. In that case, however, each element in the list must be a separate R6 object, so we need to change how the base list is created:
library(R6)
nestedR6 <- R6Class(
  'myList',
  cloneable = FALSE,
  lock_objects = FALSE,
  public = list(
    ref_list = NULL,
    initialize = function(depth) {
      if (depth > 0) {
        depth <- depth - 1
        self$level <- paste0('depth_', depth)
        self[[paste0('Depth_', depth, '_A')]] <- nestedR6$new(depth)
        self[[paste0('Depth_', depth, '_B')]] <- nestedR6$new(depth)
      }
    }
  )
)
myListR6 <- nestedR6$new(10)
R6obj <- myListR6$Depth_9_B$Depth_8_A$Depth_7_A$Depth_6_A$Depth_5_A$Depth_4_A$Depth_3_A$Depth_2_A
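The reason a variable like R6obj behaves as a pointer is that R6 objects are environments, and environments have reference semantics. A base R sketch of the same idea, using plain environments as a stand-in for R6 objects (make_node, root and ref are hypothetical names, and this is not the R6 implementation itself):

```r
# Each node is a plain environment, a base R stand-in for an R6 object.
make_node <- function(depth) {
  node <- new.env()
  if (depth > 0) {
    depth <- depth - 1
    node$level <- paste0("depth_", depth)
    node[[paste0("Depth_", depth, "_A")]] <- make_node(depth)
    node[[paste0("Depth_", depth, "_B")]] <- make_node(depth)
  }
  node
}

root <- make_node(3)

# Assigning an environment to a variable stores a reference, not a copy.
ref <- root$Depth_2_B$Depth_1_A
level_before <- ref$level

# Modifying through the reference changes the node inside the tree.
ref$level <- "modified"
level_in_tree <- root$Depth_2_B$Depth_1_A$level
```

After the path has been resolved once, each later access through ref is a single lookup, which is consistent with the R6 variant needing no repeated traversal.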
Then we can compare the speed of all 4 methods (direct access is benchmarked in both its $ and [[ ]] forms):
library(microbenchmark)
library(ggplot2)
mbm <- microbenchmark(
  direct = myList$Depth_9_B$Depth_8_A$Depth_7_A$Depth_6_A$Depth_5_A$Depth_4_A$Depth_3_A$Depth_2_A$level,
  direct2 = myList[['Depth_9_B']][['Depth_8_A']][['Depth_7_A']][['Depth_6_A']][['Depth_5_A']][['Depth_4_A']][['Depth_3_A']][['Depth_2_A']][['level']],
  eval_expression = {
    eval(str2lang(paste0('myList', path)))
  },
  pointer = pointer_to_the_object,
  R6_Class = R6obj[['level']],
  times = 100
)
autoplot(mbm)
Surprisingly, access via the pointr object is the slowest, and the R6 reference is even faster than direct access. Unfortunately, R6 is not the optimal solution, because creating the nested structure from R6 objects is significantly slower than building the S3 list:
microbenchmark(
  S3 = create_nested_list(10),
  R6 = nestedR6$new(10)
)