
I have been writing code using R reference classes. However, as I have progressed, the program has become intolerably slow. To demonstrate the problem, take the following example:

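# Reference class with one typed field and three trivial methods
myClass <- setRefClass(
  Class = "myClass",
  fields = c(
    counter = "numeric"
  ),
  methods = list(
    initialize = function() {
      counter <<- 0
    },
    # Empty method: measures the bare cost of a method call
    donothing = function() {

    },
    rand = function() {
      return(runif(1))
    },
    # Non-local assignment to the field
    increment = function() {
      counter <<- counter + 1
    }
  )
)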

mc <- myClass()
system.time(for (it in 1:500000) {
  mc$donothing()
})
system.time(for (it in 1:500000) {
  mc$rand()
})
system.time(for (it in 1:500000) {
  mc$increment()
})

It takes:

  • 4 s to call an empty method
  • 7 s to generate a random number
  • 19 s to increment a field value

It's the last result that's causing me problems. I wouldn't expect incrementing a number to take more than twice as long as generating a random number. My code involves a lot of accessing and changing of field values in a reference class, and this performance issue has made the program all but unusable.

My question: is there anything I can do to improve the performance of field lookup/access in R reference classes? Is there anything I should be doing differently?

csgillespie
Brendon
  • I don't know, but I'm suspicious of the `<<-`. Is that really the right way to increment a reference class field ?? – Ben Bolker Feb 10 '14 at 03:06
  • @BenBolker I believe so, the documentation page says "Note that non-local assignment is required". I can't see any other obvious way of doing it. The [assignment documentation](https://stat.ethz.ch/R-manual/R-devel/library/base/html/assignOps.html) says `<<-` causes "a search to be made through parent environments for an existing definition of the variable being assigned". I suspect this is where the performance hit lies. – Brendon Feb 10 '14 at 07:19
  • You don't have to use non-local assignments, as you can use `.self$` or `.self$field()`. But that doesn't really speed things up either. Good question! – Rappster Mar 12 '14 at 19:30

1 Answer


It seems a major performance issue was due to providing class names in the fields argument. If I replace

fields=c(
    counter="numeric"
),

with

fields=c("counter")

the calculation completes in 5s, compared to 19s. It is difficult to determine from the documentation why the performance penalty is so great -- perhaps it is due to checking of classes during assignment. The documentation mentions the following:

In particular, fields with a specified class are implemented as a special form of active binding to enforce valid assignment to the field

I'm not too sure what 'active binding' is, but I assume it introduces some pre-assignment logic.
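
For what it's worth, here is a minimal sketch of what an active binding does in base R, using `makeActiveBinding`. This is not the code the methods package uses for typed fields, and the names `env`, `store` and `checks` are made up for illustration. It only shows that every read or write of the bound symbol runs a function, which is the kind of per-assignment overhead I suspect is involved.

```r
# Illustration only: an active binding that validates every assignment.
# 'env', 'store' and 'checks' are hypothetical names, not part of any API.
env <- new.env()
store <- 0    # backing value for the binding
checks <- 0   # number of times the assignment branch has run

makeActiveBinding("counter", function(value) {
  if (missing(value)) {
    # Read access: the function is called with no argument
    store
  } else {
    # Write access: the function is called with the new value
    checks <<- checks + 1
    if (!is.numeric(value)) stop("counter must be numeric")
    store <<- value
    invisible(value)
  }
}, env)

env$counter <- 1                 # runs the validation branch
env$counter <- env$counter + 1   # one read, then one validated write
print(env$counter)  # 2
print(checks)       # 2
```

Assigning a non-numeric value, e.g. `env$counter <- "a"`, raises an error and leaves the stored value unchanged. A plain variable assignment involves no such function call, which would be consistent with the untyped field being faster.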

Brendon