For security reasons, we have created our own mini CRAN repository. I would like to prevent R packages from being installed from anywhere other than our repository, whether from a tar file on a user's desktop or from an actual CRAN repository. We have set

local({r <- getOption("repos")
       r["CRAN"] <- "http://fakecran.com/R/cran/"
       options(repos=r)})

in Rprofile.site. However, users can still change getOption("repos") to whatever they want, and they can also specify a location directly using install.packages(repos="http://cran.r-project.org"). Is there any way to override these methods in Rprofile.site (which our users are unable to access) in such a way that they can't then be overridden again?

rawr
michaelp
  • As long as users have internet access there is no perfect security. You could change the source of `install.packages` for every R installation and hard-code your own repository. – Roland Mar 18 '15 at 16:15
  • in what way is downloading statistics code from cran a security risk? – rawr Mar 18 '15 at 16:16
  • @rawr - it's code. It does what it wants. It can destroy you. Seriously though it can be a security risk. I can think of ways to make it *harder* for a user to download from CRAN but am having trouble thinking of ways of making it impossible. – Dason Mar 18 '15 at 16:18
  • http://tinfoilhat.stackexchange.com – rawr Mar 18 '15 at 16:20
  • Well, you could blacklist all CRAN mirrors on your proxy. – Roland Mar 18 '15 at 16:24
  • @Roland It's a start but that doesn't take care of the problem of users installing locally from a tar/zip – Dason Mar 18 '15 at 16:25
  • @rawr There are concerns because it's open source. I completely agree that the risk is low, but with some of the R packages having C in them, it's possible that something malicious could be in there. We have to scan anything that comes in. – michaelp Mar 18 '15 at 16:42
  • @Roland So we did think about going that route but someone could email something to themselves if they wanted it and then load it from their desktop with repos as well. Then you also have to keep up with the possible changes in CRAN mirrors, which is ultimately why I was trying to override – michaelp Mar 18 '15 at 16:43
  • I hear that argument a bit. "We don't trust it *because* it's open source". Makes me chuckle because the implication is that the code that you trust is the code that people aren't willing to show you :D – Dason Mar 18 '15 at 16:44
  • @Dason I agree. Malicious code can be inserted into any code and be very difficult to find. In this case though, we do pretty comprehensive scans and security checks for every third party piece of software we use. This is really just about being able to scan every package before our users use it. – michaelp Mar 18 '15 at 16:50
  • Oh I understand - I just find the argument that it's untrusted *because* it's open source to be silly. In this case it's more untrusted because the code in packages comes with no guarantees. Like I said, I can think of ways to make it harder but am having trouble thinking of a foolproof way that would make it impossible to install packages. I mean, a user could conceivably even unpack the package and place it in their library without ever needing to use `install.packages` or `R CMD INSTALL`. – Dason Mar 18 '15 at 16:56
  • @Dason And I'd be pretty happy to hear about how to make it _harder_ to download from CRAN. Honestly, I just have to mitigate what I can, and then other people will decide whether or not it's enough to use R. I would hate for us, as a company, to be unable to use R because of something highly unlikely. – michaelp Mar 18 '15 at 16:58
  • Not a developed idea, but `install.packages=function(pkgs, lib, repos, ...) { stopifnot(identical(repos, "http://my.repos")); utils::install.packages(pkgs, lib, repos, ...) }`; use site Rprofile / Renviron to place in front of utils on `search()`; expose R through a script that strips --vanilla, --no-site-file, etc options. Beware of `source()`, which evaluates arbitrary R code from a URL. Wait for irritated users wondering why R is broken, install their own R (not hard), etc. Alternative: run R in a docker or other container where the user can do whatever damage they want. – Martin Morgan Mar 18 '15 at 20:31
  • @MartinMorgan It took some playing around, but I went with your first option. In our environment, it's exceedingly difficult to install any software, even through the command line. I also asked others whether we could consider the last solution, which I think is the best, but they said that transporting potentially sensitive data outside of our work environment is a no-go. As an aside, I sent my wife, who works with gene/metabolomics sequencing for MS stuff, to your website. – michaelp Mar 20 '15 at 12:11
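
The wrapper idea from Martin Morgan's comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `allowed_repos` reuses the placeholder URL from the question, and `make_guarded_install` and its `installer` argument are hypothetical names introduced here so the check can be exercised without touching the network. It is not a complete lock-down. It does not cover the `contriburl` argument, `R CMD INSTALL`, `source()`, or a user unpacking a package into their library by hand.

```r
# Placeholder URL taken from the question; replace with the real internal repository.
allowed_repos <- "http://fakecran.com/R/cran/"

# Hypothetical helper: returns a replacement for install.packages() that
# refuses any `repos` value other than `allowed`.
# `installer` defaults to the real utils::install.packages but can be
# swapped for a stub when testing.
make_guarded_install <- function(allowed, installer = utils::install.packages) {
  function(pkgs, lib, repos = getOption("repos"), ...) {
    # repos = NULL means "install from local files", which is rejected too.
    if (is.null(repos) || !identical(unname(repos), allowed)) {
      stop("packages may only be installed from ", allowed, call. = FALSE)
    }
    # A missing `lib` is passed through, so the installer picks its own default.
    installer(pkgs, lib, repos = repos, ...)
  }
}

# Try the check with a stub installer that just reports the repository used.
stub_installer <- function(pkgs, lib, repos, ...) repos
guarded <- make_guarded_install(allowed_repos, installer = stub_installer)

guarded("somepkg", repos = allowed_repos)   # accepted, reaches the installer
blocked <- try(guarded("somepkg", repos = "http://cran.r-project.org"),
               silent = TRUE)               # rejected with an error
```

In a site profile, the result of `make_guarded_install(allowed_repos)` would be assigned to the name `install.packages` in an environment that sits ahead of utils on `search()`, as the comment suggests. A user can still call `utils::install.packages()` directly, so this raises the bar rather than closing the door.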

0 Answers