I am preparing an R package for submission to CRAN that includes two demos and a vignette (that essentially explains the two demos). One demo runs relatively quickly, the other takes quite some time (more than 30 minutes). To speed up the demo and vignette I created an Rda file that contains the results of the function that takes a long time to run. Specifically, here is the code:
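library(compiler)  # provides cmpfun()

# Path to the cached results shipped with the package.
# Note: system.file() returns "" when the file is not installed.
matchfile <- system.file("doc/tmatch.nmes.rda", package="TriMatch")
if(file.exists(matchfile)) {
    # Fast path: load the precomputed tmatch.smoke and tmatch.packyears
    load(matchfile)
} else {
    # Slow path: byte-compile trimatch, then run both matches
    trimatch <- cmpfun(trimatch)
    tmatch.smoke <- trimatch(tpsa.smoke, exact=nmes[,c("LastAge5","MALE","RACE3")])
    tmatch.packyears <- trimatch(tpsa.packyears, exact=nmes[,c("LastAge5","MALE","RACE3")])
    # Cache the results, then recompress the file
    save(tmatch.smoke, tmatch.packyears, file=matchfile)
    tools::resaveRdaFiles(matchfile)
}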
I could probably drop the file.exists check, since I have already run the code and include the tmatch.nmes.rda file in the package, but I want to show how that file was created. This has worked well locally: the vignette now builds in a reasonable amount of time and the demo also runs quickly. However, the tmatch.nmes.rda file is about 12 MB even after compressing it with resaveRdaFiles, and R CMD check now gives me a NOTE about the size of the data file. Before pleading my case to the CRAN maintainers, I am seeking advice here. Here are my questions:
1. Can I prevent R CMD check from running the code in the vignette source? I know I can locally, but how do I prevent that from happening once I submit to CRAN? If I can do this, I can just omit the data file.
2. Is it a bad idea to include a pre-processed data file to speed up the vignette and demo?
The package I am working on is hosted here: https://github.com/jbryer/TriMatch
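For anyone who wants to try the load-or-compute pattern without installing the package, here is a minimal self-contained sketch. The names slow_fit and cached, the object name result, and the temporary file path are hypothetical stand-ins for illustration only; none of them are part of TriMatch, and the real trimatch call is far slower than this placeholder.

```r
# Hypothetical stand-in for a slow function such as trimatch().
slow_fit <- function(n) {
  Sys.sleep(0.1)        # pretend this is the expensive step
  cumsum(seq_len(n))
}

# Return the cached object stored at `path` if it exists;
# otherwise call compute(), save its value to `path`, and return it.
cached <- function(path, compute) {
  if (file.exists(path)) {
    env <- new.env()
    load(path, envir = env)   # restores the object named `result`
    return(env$result)
  }
  result <- compute()
  save(result, file = path, compress = "xz")
  result
}

# Illustrative path only; tempfile() does not create the file.
cache_file <- tempfile(fileext = ".rda")

first  <- cached(cache_file, function() slow_fit(1000))  # computes and saves
second <- cached(cache_file, function() slow_fit(1000))  # loads from disk

identical(first, second)      # both calls yield the same object
file.info(cache_file)$size    # size of the cache file in bytes
```

Checking file.info(...)$size after saving is a quick way to see how large the cached file would be before it goes into the package.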