I have tried to figure out the actual memory requirements for storing a particular object. I tried two methods:

  • object.size(obj)
  • save(obj, file = "obj.Rdata") and checking the file size.

The .Rdata file is compressed, so it was always smaller than what object.size() returned, until I saw this object:

> object.size(out)
144792 bytes
> save(out, file = "out.Rdata")
# the file has 211 759 bytes

When I load the file in a new R session and run object.size(out), it reports 144792 bytes again.

Any idea how this can happen?

I don't want to post the complete object here since it contains closed data, but I can at least post the str output (it is the output of an R2jags::jags call, an object of class rjags):

> str(out)
List of 6
 $ model             :List of 8
  ..$ ptr      :function ()  
  ..$ data     :function ()  
  ..$ model    :function ()  
  ..$ state    :function (internal = FALSE)  
  ..$ nchain   :function ()  
  ..$ iter     :function ()  
  ..$ sync     :function ()  
  ..$ recompile:function ()  
  ..- attr(*, "class")= chr "jags"
 $ BUGSoutput        :List of 24
  ..$ n.chains       : int 2
  ..$ n.iter         : num 1000
  ..$ n.burnin       : num 500
  ..$ n.thin         : num 1
  ..$ n.keep         : int 500
  ..$ n.sims         : int 1000
  ..$ sims.array     : num [1:500, 1:2, 1:5] -5.86e-06 -3.78e-02 6.92e-02 4.33e-02 4.34e-02 ...
  .. ..- attr(*, "dimnames")=List of 3
  .. .. ..$ : NULL
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
  ..$ sims.list      :List of 5
  .. ..$ alpha         : num [1:1000, 1] 0.04702 -0.00818 0.03757 0.00799 0.00369 ...
  .. ..$ beta          : num [1:1000, 1] -0.135 -0.2082 -0.0112 -0.129 -0.1613 ...
  .. ..$ deviance      : num [1:1000, 1] 16028 22052 16127 16057 16141 ...
  .. ..$ overdisp_sigma: num [1:1000, 1] 0.26506 0.00821 0.24998 0.25793 0.26013 ...
  .. ..$ yr_reff_sigma : num [1:1000, 1] 0.1581 0.176 0.0695 0.1052 0.1043 ...
  ..$ sims.matrix    : num [1:1000, 1:5] 0.04702 -0.00818 0.03757 0.00799 0.00369 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
  ..$ summary        : num [1:5, 1:9] 3.16e-03 -1.20e-01 1.68e+04 2.29e-01 1.19e-01 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
  .. .. ..$ : chr [1:9] "mean" "sd" "2.5%" "25%" ...
  ..$ mean           :List of 5
  .. ..$ alpha         : num [1(1d)] 0.00316
  .. ..$ beta          : num [1(1d)] -0.12
  .. ..$ deviance      : num [1(1d)] 16835
  .. ..$ overdisp_sigma: num [1(1d)] 0.229
  .. ..$ yr_reff_sigma : num [1(1d)] 0.119
  ..$ sd             :List of 5
  .. ..$ alpha         : num [1(1d)] 0.0403
  .. ..$ beta          : num [1(1d)] 0.0799
  .. ..$ deviance      : num [1(1d)] 2378
  .. ..$ overdisp_sigma: num [1(1d)] 0.0702
  .. ..$ yr_reff_sigma : num [1(1d)] 0.036
  ..$ median         :List of 5
  .. ..$ alpha         : num [1(1d)] 0.00399
  .. ..$ beta          : num [1(1d)] -0.123
  .. ..$ deviance      : num [1(1d)] 16209
  .. ..$ overdisp_sigma: num [1(1d)] 0.252
  .. ..$ yr_reff_sigma : num [1(1d)] 0.111
  ..$ root.short     : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
  ..$ long.short     :List of 5
  .. ..$ : int 1
  .. ..$ : int 2
  .. ..$ : int 3
  .. ..$ : int 4
  .. ..$ : int 5
  ..$ dimension.short: num [1:5] 0 0 0 0 0
  ..$ indexes.short  :List of 5
  .. ..$ : NULL
  .. ..$ : NULL
  .. ..$ : NULL
  .. ..$ : NULL
  .. ..$ : NULL
  ..$ last.values    :List of 2
  .. ..$ :List of 4
  .. .. ..$ alpha         : num [1(1d)] 0.0296
  .. .. ..$ beta          : num [1(1d)] -0.0964
  .. .. ..$ deviance      : num [1(1d)] 16113
  .. .. ..$ overdisp_sigma: num [1(1d)] 0.265
  .. ..$ :List of 4
  .. .. ..$ alpha         : num [1(1d)] 0.0334
  .. .. ..$ beta          : num [1(1d)] -0.228
  .. .. ..$ deviance      : num [1(1d)] 16139
  .. .. ..$ overdisp_sigma: num [1(1d)] 0.257
  ..$ program        : chr "jags"
  ..$ model.file     : chr "model.txt"
  ..$ isDIC          : logi TRUE
  ..$ DICbyR         : logi TRUE
  ..$ pD             : num 2830902
  ..$ DIC            : num 2847738
  ..- attr(*, "class")= chr "bugs"
 $ parameters.to.save: chr [1:5] "alpha" "beta" "overdisp_sigma" "yr_reff_sigma" ...
 $ model.file        : chr "model.txt"
 $ n.iter            : num 1000
 $ DIC               : logi TRUE
 - attr(*, "class")= chr "rjags"
Tomas

1 Answer

One way this can happen is if the object has an associated environment that needs saving with it if it is to make sense. This comes up most commonly in the context of "closures" (see here for one explanation).

Without a reproducible example (and without having used R2jags myself) I can't tell you whether that's what is going on in your case, but it at least seems plausible, given that: (a) closures seem to be the most common cause of this situation; (b) based on the output of str(out), your object includes a bunch of functions; and (c) closures seem like a useful way to organize a computation-heavy and possibly parallelizable procedure like MCMC.

## Define a function "f" that returns a closure, here assigned to the object "y"
f <- function() {
    x <- 1:1e6
    function() 2*x
}
y <- f()
environment(y)
# <environment: 0x0000000008409ab8>

object.size(y)
# 1216 bytes

save(y, file="out.Rdata")
file.info("out.Rdata")$size
# [1] 2128554
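
To separate the effect of the enclosing environment from the effect of file compression, one option is to compare object.size() with the length of the uncompressed serialized form. The following is a minimal sketch, not taken from R2jags; the names make_closure, g, shallow and deep are made up for illustration, and rnorm() is used so the captured vector is stored in full.

```r
# Hypothetical names throughout; a sketch, not part of base R or R2jags.
make_closure <- function() {
    x <- rnorm(1e5)       # 1e5 doubles (about 800 kB) captured by the closure
    function() 2 * x
}
g <- make_closure()

# object.size() does not follow environments, so the captured x is not counted
shallow <- as.numeric(object.size(g))

# serialize() writes the closure together with its enclosing environment,
# uncompressed, so its length includes the captured x
deep <- length(serialize(g, NULL))

shallow < deep
deep > 8 * 1e5
```

If the serialized length is far larger than what object.size() reports, the difference most likely comes from data reachable through environments rather than from the compression step.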
Josh O'Brien
  • That's a nice idea, but my object is not a function. It is an object (or list?). `environment(out)` returns NULL. – Tomas Feb 17 '14 at 18:54
  • It's a list that includes a bunch of functions among its elements (see, for example, the elements of `out$model`). I'd suggest checking **their** environments, with something like `environment(out$model$data)`, etc. – Josh O'Brien Feb 17 '14 at 18:58
  • Bingo! This returns something! How can I see how big the environment is? I tried `object.size(environment(y))` but it returns 28, which is nonsense. – Tomas Feb 17 '14 at 19:07
  • Try something like `eapply(environment(y), object.size)` or, really, `sum(unlist(eapply(environment(y), object.size)))`. That'll at least get you started (unless, of course, the components of `environment(y)` are themselves environments!) – Josh O'Brien Feb 17 '14 at 19:10
  • Since `save()` and `saveRDS()` compress their output, it's also possible (although unlikely) to have pathological data that is larger compressed than uncompressed. – hadley Feb 17 '14 at 20:21
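
The `eapply()` suggestion from the comments can be wrapped up for a list of functions such as `out$model`. This is only a rough sketch: `env_size`, `closure_env_sizes`, `make_accessors` and `demo` are hypothetical names, nested environments are not followed, and an environment shared by several functions is counted once per function.

```r
# Hypothetical helper: sum object.size() over all bindings of one environment.
# Nested environments are not followed, so this is a lower bound.
env_size <- function(env) {
    if (is.null(env)) return(0)   # primitives have no environment
    sizes <- eapply(env, object.size, all.names = TRUE)
    sum(as.numeric(unlist(sizes)))
}

# Hypothetical helper: report the environment size for each function in a list.
closure_env_sizes <- function(lst) {
    fns <- Filter(is.function, lst)
    vapply(fns, function(f) env_size(environment(f)), numeric(1))
}

# Small stand-in for an object like out$model: two closures sharing one environment
make_accessors <- function() {
    big <- rnorm(1e4)             # 1e4 doubles captured by both closures
    list(get = function() big, n = function() length(big))
}
demo <- make_accessors()
closure_env_sizes(demo)
```

With the real object, something like `closure_env_sizes(out$model)` would be the analogous call. Note that a function whose environment is the global environment or a package namespace will report everything bound there, even though `save()` does not write those environments to the file.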