
I've found that compiling Stan models is slow, and I'm often recompiling the same model ("recompiling to avoid crashing R session"). It seems like ccache would be a great solution here -- I could cache the result of compilation, and R could reload the compiled object as necessary. However, ccache isn't able to return the cached results because rstan::stan_model creates temporary C++ files with a different name on each call (see the output below). Is there any way to have stan_model use the same C++ filename? Or is there a better way to cache compilation? This comment in the rstan code makes it seem like caching could be possible.

model_code <- brms::make_stancode(
  count ~ zAge + zBase * Trt + (1 | patient), 
  data = brms::epilepsy, family = "poisson"
)

cpp1 <- rstan::stanc(model_code=model_code, model_name="my_model", obfuscate_model_name=FALSE)
cpp2 <- rstan::stanc(model_code=model_code, model_name="my_model", obfuscate_model_name=FALSE)
# The content of the C++ code is identical
identical(cpp1, cpp2)
#> TRUE

m1 <- rstan::stan_model(stanc_ret=cpp1, auto_write=FALSE, save_dso=FALSE, verbose=TRUE)
#> Compilation argument:
#> /usr/lib/R/bin/R CMD SHLIB file1b4dc0286e.cpp 2> file1b4dc0286e.cpp.err.txt 

m2 <- rstan::stan_model(stanc_ret=cpp1, auto_write=FALSE, save_dso=FALSE, verbose=TRUE)
#> Compilation argument:
#> /usr/lib/R/bin/R CMD SHLIB file1b4d61eb0c7.cpp 2> file1b4d61eb0c7.cpp.err.txt 

System details: Ubuntu, R 4.0, gcc 9.3.0, ccache 3.7.7. ccache has its default settings, except for hash_dir=false and compression=true.

Footnote: Why not use auto_write=TRUE? It's possible I'm doing something wrong, but in the models I've run, it doesn't prevent recompilation.

karldw
  • The filename in the temporary directory is derived from the md5sum of the Stan program. So, that name only changes when the Stan program changes. The `rstan_options(auto_write = TRUE)` mechanism is intended to handle this situation without ccache, although it is possible that you have .rds files created by an older version of Stan that make it think that it always needs to recompile the C++ that Stan generates. – Ben Goodrich May 09 '20 at 21:51
  • I see. It seems like the brms model _is_ getting saved when I set `auto_write=TRUE`, so I am able to avoid recompiling if I run the same model again in the same session. However, it looks like the `.rds` file is saved to the temporary directory (maybe because brms passes the Stan model as a string rather than writing out a file), so it's not available in a different R session. I also noticed that when I pass the family as e.g. `brms::hurdle_poisson()` (a brmsfamily) rather than a string, I consistently get "recompiling to avoid crashing R". Is this worth reporting on GitHub? – karldw May 11 '20 at 03:15
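
Since the `.rds` file ending up in the temporary directory seems to be what loses the cache between sessions, one possible workaround is to key a persistent cache on a hash of the Stan program text. The sketch below is only an illustration: `cached_stan_model`, `cache_dir`, and `compile` are hypothetical names, not part of rstan or brms. It assumes that a `stanmodel` saved with `saveRDS()` can be reloaded later, which should hold only if the compiled DSO is kept (so `save_dso` is left at its default rather than set to `FALSE`) and the rstan version has not changed.

```r
# Hypothetical helper: compile a Stan program once and reuse it across sessions.
# `compile` is injectable so the caching logic can be tried without rstan.
cached_stan_model <- function(model_code, cache_dir = "stan_cache", compile = NULL) {
  if (is.null(compile)) {
    # Assumption: the default save_dso = TRUE keeps the compiled DSO in the object.
    compile <- function(code) rstan::stan_model(model_code = code, auto_write = FALSE)
  }
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

  # Hash the program text with base R; tools::md5sum() works on files.
  tmp <- tempfile(fileext = ".stan")
  on.exit(unlink(tmp), add = TRUE)
  writeLines(model_code, tmp)
  key <- unname(tools::md5sum(tmp))

  # Reuse the serialized model if this exact program was compiled before.
  rds_path <- file.path(cache_dir, paste0("model_", key, ".rds"))
  if (file.exists(rds_path)) {
    return(readRDS(rds_path))
  }

  model <- compile(model_code)
  saveRDS(model, rds_path)
  model
}
```

With the `model_code` from the question, `m <- cached_stan_model(model_code)` would compile on the first call and read the `.rds` file on later calls, including in a new R session started in the same working directory. Clearing `cache_dir` after upgrading rstan is probably wise, given the note above about stale `.rds` files.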

0 Answers