I'm trying to include a (somewhat) large dataset in an R package. During the check in RStudio I keep getting a warning saying that I could save space with better compression:

* checking data for ASCII and uncompressed saves ... WARNING

  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
          old_size new_size compress
  slp.rda    499Kb    310Kb    bzip2
  sst.rda    1.3Mb    977Kb       xz

I've tried adding --resave-data to the options in RStudio's "Configure Build Tools", to no effect.

Marc in the box

2 Answers

Another alternative, if you have a large dataset that you don't want to re-create, is to use tools::resaveRdaFiles from within R. Point it at the dataset file, or the entire data directory, and it will compress your data in a format of your choosing. See its manual page for more information.
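As a rough illustration of that approach, the sketch below builds a throwaway data directory in a temporary location, re-saves its contents with tools::resaveRdaFiles, and uses tools::checkRdaFiles to report the size and compression before and after. The directory, the file name and the `sst` data frame are made up for the example; in a real package you would point the call at your own data directory instead.

```r
# Hypothetical stand-in for a package's data/ directory.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# Made-up dataset, saved with save()'s default (gzip) compression.
sst <- data.frame(
  x = rep(seq_len(1000), times = 50),
  y = rep(letters[1:10], times = 5000)
)
rda_path <- file.path(data_dir, "sst.rda")
save(sst, file = rda_path)

# Inspect the files before re-saving.
before <- tools::checkRdaFiles(data_dir)

# compress = "auto" lets R try the available formats and keep the smallest;
# "gzip", "bzip2" or "xz" can be requested explicitly instead.
tools::resaveRdaFiles(data_dir, compress = "auto")

after <- tools::checkRdaFiles(data_dir)
print(before[, c("size", "compress")])
print(after[, c("size", "compress")])
```

With compress = "auto" the result may still be gzip if the other formats do not offer a worthwhile saving, so checking the output of tools::checkRdaFiles is the easiest way to see what was actually chosen.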

Martin Smith

The devtools function use_data takes a parameter for the type of compression and makes adding data to packages much easier in general. Whether you use it or just call save on your own, use xz compression when you save your data (for save it's the compress parameter, e.g. compress = "xz").
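A minimal sketch of the save route, using a made-up matrix called `sst` and temporary files rather than a real package: it writes the same object once with the default compression and once with compress = "xz", then compares the results. In current releases use_data is provided by the usethis package, and the commented-out call at the end assumes that version.

```r
# Made-up data for illustration only; not the dataset from the question.
set.seed(1)
sst <- matrix(round(rnorm(2e5, mean = 15, sd = 5), 1), ncol = 200)

gz_path <- tempfile(fileext = ".rda")
xz_path <- tempfile(fileext = ".rda")

save(sst, file = gz_path)                   # default compression (gzip)
save(sst, file = xz_path, compress = "xz")  # xz compression

# Compare the file sizes on disk.
sizes <- file.size(c(gz_path, xz_path))
names(sizes) <- c("gzip", "xz")
print(sizes)

# Confirm which compression each file actually uses.
info <- tools::checkRdaFiles(c(gz_path, xz_path))
print(info[, c("size", "compress")])

# Inside a package project, the equivalent with use_data would be (not run):
# usethis::use_data(sst, compress = "xz", overwrite = TRUE)
```

How much xz gains over gzip depends on the data, so it is worth checking the sizes for your own files rather than assuming a particular ratio.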

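If you want to use --resave-data, pass it to R CMD build and try spelling it out as --resave-data=best, which picks the most effective compression for each file. When the option is not supplied at all, R CMD build falls back to gzip (gaining you pretty much nothing in this case).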

See the "Building package tarballs" section of Writing R Extensions for more information.

hrbrmstr
  • Thanks for your answer - I have tried `save` with compression. The compression error is now gone, but now I get the warning: `Warning: package needs dependence on R (>= 2.10)`. Any experience with that? – Marc in the box Sep 16 '15 at 10:49
  • That's due to the extra compression: files saved with `bzip2` or `xz` can only be read by R 2.10 or later. Add `R (>= 2.10)` to the `Depends` field of your `DESCRIPTION` file. – hrbrmstr Sep 16 '15 at 10:55