
I am working on an R package where performance matters (bootstrapping means there will be many iterations), and I planned to use the R6 library because an object-oriented approach made the most sense to me. The alternative seems to be "semi-objects" built from lists. However, R6 seems slower than lists, even with portable = FALSE and class = FALSE.

Are there ways to speed up R6? Or are lists the way to go?

Here is a simple comparison. The difference between list and R6/R6None seems to be larger for more complex objects:

#     test replications elapsed relative user.self sys.self
# 3   list          100    4.80    1.000      4.45     0.34
# 1     R6          100    5.03    1.048      4.84     0.17
# 2 R6None          100    4.82    1.004      4.59     0.19

Code for the example:

library(data.table)

TestClassNone <- R6::R6Class("TestClassNone",
            portable = FALSE,
            class = FALSE,
            public = list(
              data = NULL,
              sd = NULL,
              sub = NULL,
              initialize = function(data) {
                data <<- data
                get_sd()
                get_sub()
              },
              get_sd = function() {
                sd <<- sd(data[, V1])
              },
              get_sub = function() {
                sub <<- data[1:10]
              }
            ))
TestClass <- R6::R6Class("TestClass",
     public = list(
       data = NULL,
       sd = NULL,
       sub = NULL,
       initialize = function(data) {
         self$data <- data
         self$get_sd()
         self$get_sub()
       },
       get_sd = function() {
         self$sd <- sd(self$data[, V1])
       },
       get_sub = function() {
         self$sub <- self$data[1:10]
       }
     ))

rbenchmark::benchmark("R6" = {
  data <- as.data.table(sample(5e5))
  test <- TestClass$new(data)
},
"R6None" = {
  data <- as.data.table(sample(5e5))
  test <- TestClassNone$new(data)
},
"list" = {
  data <- as.data.table(sample(5e5))
  test <- list(data = data)
  test$get_sd <- function(x) sd(x[["data"]][, V1])
  test$get_sub <- function(x) x[["data"]][1:10, ]
  test$sd <- test$get_sd(test)
  test$sub <- test$get_sub(test)
},
replications = 100,
columns = c("test", "replications", "elapsed",
            "relative", "user.self", "sys.self"))
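
Note that every timed expression above also builds the data with as.data.table(sample(5e5)), so data generation may account for much of the elapsed time. A minimal sketch of a benchmark that creates the input once and times only object construction is below. The `Holder` class is a hypothetical stand-in for TestClass, and a plain integer vector is used instead of a data.table to keep the example small; the numbers it produces will differ from the table above.

```r
library(R6)

# Hypothetical minimal class standing in for TestClass (illustration only)
Holder <- R6Class("Holder",
  public = list(
    data = NULL,
    initialize = function(data) {
      self$data <- data
    }
  )
)

# Build the input once, outside the timed expressions
x <- sample(5e5)

# Time only the construction of the object / list
res <- rbenchmark::benchmark(
  "R6"   = Holder$new(x),
  "list" = list(data = x),
  replications = 1000,
  columns = c("test", "replications", "elapsed", "relative")
)
res
```

With the data built up front, whatever gap remains between the two rows reflects construction overhead rather than sampling and conversion.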
taffel
  • I don't think R6 classes are going to be faster to _instantiate_, which is all your code is testing. The performance benefits normally come through preventing copies because of reference semantics, but that very much depends on how you are using your class. You're doing the right thing, which is to benchmark and go with what works best. – Allan Cameron Apr 11 '22 at 12:31
  • @AllanCameron Thank you for your answer, it makes sense! – taffel Apr 20 '22 at 22:17
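
The reference semantics mentioned in the comment above can be illustrated with a small sketch. The `Counter` class and the `add_list` helper are hypothetical names made up for this example; it shows only the difference in behavior, not any timing result.

```r
library(R6)

# Hypothetical counter class, used only to illustrate reference semantics
Counter <- R6Class("Counter",
  public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1   # modifies the object in place
      invisible(self)
    }
  )
)

# List-based equivalent: the function has to return the modified copy
add_list <- function(x) {
  x$n <- x$n + 1
  x
}

r6_obj <- Counter$new()
lst <- list(n = 0)

for (i in 1:3) {
  r6_obj$add()           # no reassignment needed
  lst <- add_list(lst)   # result must be assigned back
}

# A second name for an R6 object refers to the same object
alias <- r6_obj
alias$add()
r6_obj$n   # 4: the change made through `alias` is visible here

# A second name for a list behaves like an independent copy
lst_copy <- add_list(lst)
lst$n      # 3: the original list is unchanged
```

Whether this helps in a bootstrap loop depends on how often the object is modified and passed around, so benchmarking the real workload remains the deciding step.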
