Why does fitting a model in R consume more memory than the returned object?

Question

When performing survival analysis in R, fitting a model is reported to consume more memory than the returned object occupies. Moreover, this happens only for some of the calls, not for every one.

library(survival)   # Surv(), survfit()
library(pryr)       # mem_change(), object_size()
library(tidyverse)  # tibble(), map()

dat <- tibble(
  x = sample(letters[1:2], 1e5, replace = TRUE),
  x2 = sample(LETTERS[1:2], 1e5, replace = TRUE),
  e = sample(0:1, 1e5, replace = TRUE),
  t = rweibull(1e5, shape = 1)
)

mem_change(fit <- survfit(formula = Surv(t, e) ~ x, data = dat))
mem_change(fit2 <- survfit(formula = Surv(t, e) ~ x, data = dat))
mem_change(fit3 <- survfit(formula = Surv(t, e) ~ 1, data = dat))
mem_change(fit4 <- survfit(formula = Surv(t, e) ~ x2, data = dat))
mem_change(fit5 <- survfit(formula = Surv(t, e) ~ x + x2, data = dat))

map(list(fit, fit2, fit3, fit4, fit5), object_size)
object_size(fit, fit2, fit3, fit4, fit5)

For fit and fit5, pryr::mem_change() reports a change of about 7.5 MB, while each fitX object is 6.4 MB according to pryr::object_size(). Are there hidden variables created elsewhere, or is this somehow related to the C implementation under the hood of survfit?

Edit: I'm aware that the modelling process itself may temporarily consume more memory. However, pryr::mem_change() is assumed to return the net change in used memory after all computations have finished and all temporary objects have been discarded.
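
To make that assumption concrete, here is a rough base-R sketch of a net-change measurement. It assumes the approach is simply "memory in use after garbage collection, before versus after evaluating the expression". The helpers used_mb() and net_change_mb() are hypothetical names introduced for illustration, not pryr functions, and the sketch is not claimed to match pryr's implementation.

```r
# Memory currently used by R objects, in MB, measured right after a
# garbage collection. Column 2 of gc()'s result holds the "(Mb)" value
# of the "used" figures for Ncells and Vcells.
used_mb <- function() sum(gc()[, 2])

# Net change in used memory caused by evaluating `expr`.
# Hypothetical helper: `expr` is a promise, so an assignment inside it
# takes effect in the caller's environment when it is forced.
net_change_mb <- function(expr) {
  before <- used_mb()
  force(expr)
  used_mb() - before
}

# Compare the net change with the size of the object that was kept.
change <- net_change_mb(x <- numeric(1e6))
size   <- as.numeric(object.size(x)) / 1024^2

cat("net change (MB):", round(change, 2), "\n")
cat("object size (MB):", round(size, 2), "\n")
```

For a plain numeric vector the two figures should roughly agree, so the same comparison applied to the survfit() calls above might help show whether the extra memory appears on every call or only on some of them.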
