I recently submitted a package to CRAN that passed all the automatic checks but failed the manual ones. Among the reviewer's comments were the following:

Please do not set a seed to a specific number within a function.

Please do not modify the .GlobalEnv. This is not allowed by the CRAN policies.

I believe the lines of code these comments refer to are the following:

    if(simul == TRUE){
        set.seed(42)
    }

    w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    beta <- w*beta-(1-w)*beta
    s <- round((1-sparsity)*p)
    toReplace <- sample(p, size = s)
    beta <- replace(beta, list = toReplace, values = 0)

    # Generate the random p-columned matrix of indicator series.
    X <- matrix(data = rnorm((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)

    if(simul == TRUE){
        rm(.Random.seed, envir = globalenv())
    }

Essentially, the function has a simulation option `simul`: when it is set to `TRUE`, the matrix `X` and the coefficient vector `beta` should remain fixed across calls. I remove the seed at the end of this segment (final lines) because the rest of the code contains variables that should change at each iteration of the simulation. However, as the CRAN feedback notes, this is not allowed. What is an alternative way to go about this? I cannot hard-code a fixed `beta` or `X` for `simul = TRUE`, since their dimensions are inputs to the function and vary with the investigator's preferences.

Carl
  • You should let the user set the seed; that's why CRAN dislikes this practice. – Rui Barradas Jan 07 '22 at 18:23
  • What if you made the seed an argument of your function, so that users can set it to any integer they like? – Daniel James Jan 07 '22 at 18:46
  • As above, just let the user set the seed so that they can get the same results every time. As for changing the global environment, why do you need to do that? In any case, if you let `seed` be passed as an argument, you don't need to "clean up" the environment. – Valeri Voev Feb 02 '22 at 14:00
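
The suggestion in these comments can be sketched roughly as follows. This is only an illustration: the function name `make_design`, its argument list, and the placeholder starting value for `beta` are assumptions and not part of the original package. The seed becomes an optional argument, and the function touches the RNG state only when the caller supplies one.

```r
# Hypothetical generator: `seed` is an optional argument chosen by the caller.
make_design <- function(p, n, sparsity = 0.5, mean_X = 0, sd_X = 1, seed = NULL) {
  # Only set a seed when the caller explicitly asks for one.
  if (!is.null(seed)) set.seed(seed)

  beta <- rep(1, p)  # placeholder coefficients (assumption)
  w <- matrix(rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
  beta <- w * beta - (1 - w) * beta  # random sign flips
  s <- round((1 - sparsity) * p)
  beta <- replace(beta, list = sample(p, size = s), values = 0)

  X <- matrix(rnorm(n * p, mean = mean_X, sd = sd_X), nrow = n, ncol = p)
  list(beta = beta, X = X)
}

# The caller decides on reproducibility, either through the argument ...
a <- make_design(p = 10, n = 20, seed = 123)
b <- make_design(p = 10, n = 20, seed = 123)
identical(a, b)  # TRUE

# ... or by calling set.seed() themselves before the function.
set.seed(123)
d <- make_design(p = 10, n = 20)
identical(a, d)  # TRUE
```

With no `seed` supplied and no `set.seed()` call by the user, every call draws fresh values, which is the default behaviour a user would normally expect.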

3 Answers

A similar question has been asked on the Bio devel mailing list. The suggestion there was to use `withr::with_seed`. Your code could then become:

    library(withr)

    if(simul == TRUE){
      w <- with_seed(42, matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1))
    } else {
      w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    }

    beta <- w*beta-(1-w)*beta
    s <- round((1-sparsity)*p)
    toReplace <- sample(p, size = s)
    beta <- replace(beta, list = toReplace, values = 0)

    # Generate the random p-columned matrix of indicator series.
    X <- matrix(data = rnorm((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)

Of course, that raises the question of how withr got onto CRAN, given that it appears to do the same thing you are being told not to do. The difference may be that your version can overwrite an existing seed, whereas withr checks whether a seed already exists.
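
Note that the snippet above seeds only the draw of `w`, while the call to `sample()` and the generation of `X` still use the ambient RNG state. If all of these are meant to stay fixed, one option is to wrap the whole block in a single `with_seed()` call. The following is a minimal sketch under that assumption; the helper name `make_fixed_design` and its arguments are hypothetical, and the example only runs when withr is installed.

```r
# Hypothetical helper; names and defaults are illustrative only.
make_fixed_design <- function(p, n, beta, sparsity, mean_X = 0, sd_X = 1, seed = 42) {
  # Everything inside the braces is evaluated under the temporary seed.
  withr::with_seed(seed, {
    w <- matrix(rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
    beta <- w * beta - (1 - w) * beta
    s <- round((1 - sparsity) * p)
    beta <- replace(beta, list = sample(p, size = s), values = 0)
    X <- matrix(rnorm(n * p, mean = mean_X, sd = sd_X), nrow = n, ncol = p)
    list(beta = beta, X = X)
  })
}

if (requireNamespace("withr", quietly = TRUE)) {
  set.seed(1)
  before <- .Random.seed
  d1 <- make_fixed_design(p = 10, n = 20, beta = rep(1, 10), sparsity = 0.5)
  d2 <- make_fixed_design(p = 10, n = 20, beta = rep(1, 10), sparsity = 0.5)
  same_draws <- identical(d1, d2)                     # fixed part is reproducible
  seed_untouched <- identical(before, .Random.seed)   # caller's RNG state is kept
}
```

Both checks are expected to come out `TRUE`: the fixed part is identical across calls, and code outside the wrapped block continues from the caller's own RNG stream.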

Miff

If you really, really want to set the seed inside a function, which I believe neither you nor anyone else should do, save the current seed, do whatever you want, and reset it to the saved value before exiting the function.

    old_seed <- .Random.seed
    rnorm(1)
    #[1] -1.173346

    set.seed(42)
    rbinom(1, size = 1, prob = 0.5)
    #[1] 0

    .Random.seed <- old_seed
    rnorm(1)
    #[1] -1.173346

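In a function it could be something like the following (the `message` calls are only there for illustration). Note that the function never calls any pseudo-RNG and always returns TRUE invisibly. The point is to save the seed's current value and restore it in `on.exit`. Because a plain `.Random.seed <- old_seed` inside a function would only create a local variable, the restore step assigns into the global environment.

    f <- function(simul = FALSE){
      if(simul){
        message("simul is TRUE")
        # Reading finds the global .Random.seed through normal scoping.
        old_seed <- .Random.seed
        # Writing must target the global environment explicitly.
        on.exit(assign(".Random.seed", old_seed, envir = globalenv()))
        # rest of code
      } else message("simul is FALSE")
      invisible(TRUE)
    }

    f()
    s <- .Random.seed
    f(TRUE)
    identical(s, .Random.seed)
    #[1] TRUE

    rm(s)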
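
A slightly fuller sketch of the same save-and-restore idea, in which random numbers really are drawn under the temporary seed, might look like this. The helper name `with_fixed_seed` is hypothetical. It also covers the case where no seed exists yet (a fresh session in which no random number has been generated), and it reads and writes `.Random.seed` in the global environment explicitly, since a plain `<-` inside a function would only create a local variable of that name. Whether CRAN reviewers accept this pattern in a package is not something the sketch can settle.

```r
# Hypothetical helper: evaluate `code` under a temporary seed, then put the
# caller's RNG state back exactly as it was.
with_fixed_seed <- function(seed, code) {
  genv <- globalenv()
  had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
  if (had_seed) old_seed <- get(".Random.seed", envir = genv, inherits = FALSE)

  on.exit({
    if (had_seed) {
      # Restore the state that was saved on entry.
      assign(".Random.seed", old_seed, envir = genv)
    } else if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
      # There was no seed before, so remove the one created here.
      rm(list = ".Random.seed", envir = genv)
    }
  })

  set.seed(seed)
  code  # lazily evaluated here, i.e. after the seed has been set
}

set.seed(2022)
before <- .Random.seed
x1 <- with_fixed_seed(42, rnorm(3))
x2 <- with_fixed_seed(42, rnorm(3))
identical(x1, x2)                # TRUE: same seed, same draws
identical(before, .Random.seed)  # TRUE: the caller's stream is untouched
```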
Rui Barradas

When you fix the seed, a user who runs this code with the same parameters will obtain the same results every time.

Supposing that this chunk of code sits inside a larger chunk related only to the simulation, just get rid of the `set.seed()` call and try something like this:

    if(simul == TRUE){
        w <- matrix(data = rbinom(n = p, size = 1, prob = 0.5), ncol = 1)
        beta <- w*beta-(1-w)*beta
        s <- round((1-sparsity)*p)
        toReplace <- sample(p, size = s)
        beta <- replace(beta, list = toReplace, values = 0)

        # Generate the random p-columned matrix of indicator series.
        X <- matrix(data = rnorm((n_l*m) * p, mean = mean_X, sd = sd_X), ncol = p, nrow = n_l*m)
    }
Tur