Models fitted with glmmTMB become unwieldy to work with once they have been saved to RDS, because the object is much larger after the round trip through saveRDS and readRDS.

Another thread ("saveRDS inflating size of object") suggested that, for lm objects, this problem arises from environment attributes that come along for the ride when the object is saved, but glmmTMB model objects are structured differently.

Update: After a little more digging, I found this thread, which suggests that the refhook argument of saveRDS could be used to prevent environments from being written along with the object. However, I don't understand how this argument works or how to structure a refhook function.
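
For reference, here is a minimal sketch of how the refhook pair appears to work, going by the documentation in ?serialize. The hook passed to saveRDS is called for each reference object (for example a non-system environment) and returns either NULL, to serialize it as usual, or a character vector that is written in its place. The hook passed to readRDS then receives that character vector and returns the object to substitute. The toy object, the "dropped-env" label and the use of emptyenv() as the replacement are assumptions made for illustration; this is base R only and has not been tried on a glmmTMB fit.

```r
# Toy object: a formula whose environment drags a large vector along with it.
make_obj <- function() {
  big <- rnorm(1e6)   # roughly 8 MB living in this function's frame
  list(f = y ~ x)     # the formula's environment is that frame
}
obj <- make_obj()

plain  <- tempfile(fileext = ".rds")
hooked <- tempfile(fileext = ".rds")

# Default behaviour: the formula's environment (including `big`) is written too.
saveRDS(obj, plain)

# With a refhook: environments are replaced by a placeholder string;
# returning NULL leaves any other reference object to be serialized as usual.
saveRDS(obj, hooked, refhook = function(ref) {
  if (is.environment(ref)) "dropped-env" else NULL
})

# Reading back needs a matching hook that turns the placeholder into an object.
obj2 <- readRDS(hooked, refhook = function(label) emptyenv())

print(file.size(plain))
print(file.size(hooked))
print(environment(obj2$f))
```

Anything that later relies on a dropped environment (for example, a function that looks up data stored there) will no longer work, so whether this is safe for a fitted model is a separate question.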

Here is a small reproducible example showing how the object grows (by about 100 kB) after it is saved to RDS and read back in. In the much larger models I am actually running, the size swells from 2 MB to over 1 GB.

# data frame from dput
DF <- structure(list(loc = c("300", "300", "300", "300", "300", "300", 
                             "301", "301", "301", "301", "301", "302", "302", "302", "302", 
                             "302", "302", "303", "303", "304", "304", "304", "305", "305", 
                             "307", "307", "308", "308", "309", "309", "310", "310", "312", 
                             "313", "315", "317", "318", "319"), num_pts = c(100L, 100L, 100L, 
                                                                             100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
                                                                             100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
                                                                             100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
                                                                             100L, 100L), n = c(0, 1, 1, 0, 5, 1, 7, 5, 4, 2, 15, 1, 0, 1, 
                                                                                                5, 5, 8, 4, 2, 5, 10, 6, 14, 10, 9, 10, 7, 10, 0, 0, 14, 16, 
                                                                                                0, 0, 6, 2, 10, 2), yr = structure(c(27L, 28L, 29L, 30L, 31L, 
                                                                                                                                     32L, 27L, 28L, 29L, 30L, 31L, 27L, 28L, 29L, 30L, 31L, 32L, 27L, 
                                                                                                                                     31L, 27L, 31L, 32L, 27L, 31L, 28L, 32L, 28L, 32L, 28L, 32L, 28L, 
                                                                                                                                     32L, 29L, 29L, 29L, 29L, 30L, 30L), .Label = c("1984", "1985", 
                                                                                                                                                                                    "1986", "1987", "1988", "1990", "1993", "1994", "1995", "1996", 
                                                                                                                                                                                    "1997", "1998", "1999", "2000", "2001", "2002", "2003", "2004", 
                                                                                                                                                                                    "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", 
                                                                                                                                                                                    "2013", "2014", "2015", "2016", "2017", "2018"), class = "factor"), 
                     var = c(154.341666666667, 154.341666666667, 154.341666666667, 
                             154.341666666667, 154.341666666667, 154.341666666667, 149.208333333333, 
                             149.208333333333, 149.208333333333, 149.208333333333, 149.208333333333, 
                             136.025, 136.025, 136.025, 136.025, 136.025, 136.025, 150.125, 
                             150.125, 169.375, 169.375, 169.375, 156.891666666667, 156.891666666667, 
                             148.716666666667, 148.716666666667, 150.533333333333, 150.533333333333, 
                             155.2, 155.2, 150.033333333333, 150.033333333333, 152.275, 
                             155.266666666667, 155.7, 149.358333333333, 146.925, 147.575
                     )), row.names = 1912:1949, class = "data.frame")

# form model
model <- as.formula("cbind(n, num_pts - n) ~ var + (1 | yr) + (1 | loc)")

# fit
fit <- 
  glmmTMB::glmmTMB(model,
                   family = "betabinomial",
                   data=DF)


pryr::object_size(fit)  # 530 kB

saveRDS(fit, "fit.rds")

fit2 <- readRDS("fit.rds")

pryr::object_size(fit2) # 640 kB

Thanks

user1658170
  • I know you said "I can't really provide a reproducible example", but why not? Can you write a code example that simulates data with similar dimensions to your problem that demonstrates this phenomenon? `library(glmmTMB); m1 <- glmmTMB(count ~ mined + (1|site), zi=~mined, family=poisson, data=Salamanders); object.size(m1); saveRDS(m1,"tmp.rds"); file.size("tmp.rds")` shows that this is not always a problem ... It would help a lot in trying to figure out what the problem is. – Ben Bolker Oct 09 '20 at 18:28
  • Thanks @BenBolker Added a small example that shows how the object inflates. – user1658170 Oct 09 '20 at 21:06
