Suppose I want to write a function in R that depends on the data only through a couple of sufficient statistics. For example, suppose the function, call it foo.func, depends only on the sample mean of a sample of data. For convenience, I think users might like to pass foo.func either the sample itself (in which case foo.func computes the sample mean) or the sample mean, which is all that foo.func needs. For efficiency, the latter is preferable when several functions like foo.func are called and each can take the sample mean: the mean then needs to be computed only once (in my real problem, the sample statistics in question might be computationally intensive).
In summary, I would like foo.func to be accessible to the beginner (pass in the data and let the function compute the sufficient statistics) as well as to the expert (precompute the sufficient statistics for efficiency and pass them in). What are the recommended practices for this? Should I pass in a logical flag? Use multiple arguments? Some ways to do it might be:
# optional arguments
foo.func <- function(xdata, suff.stats = NULL) {
  if (is.null(suff.stats)) {
    # no precomputed statistics supplied, so compute them from the data
    suff.stats <- compute.suff.stats(xdata)
  }
  # now operate on suff.stats
}
or
# flag input
foo.func <- function(data.or.stat, gave.data = TRUE) {
  if (gave.data) {
    data.or.stat <- compute.suff.stats(data.or.stat)
  }
  # now operate on data.or.stat
}
I am leaning towards the former, I think.
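For concreteness, here is a minimal sketch of how the first option might look in the sample-mean case, with both calling styles. The body of compute.suff.stats, the second function bar.func, and the computations inside each function are hypothetical placeholders for illustration only. I have also assumed that xdata defaults to NULL so the expert can omit the data entirely, which differs slightly from the signature above.

```r
# Hypothetical helper: the only sufficient statistic here is the sample mean.
compute.suff.stats <- function(xdata) {
  list(mean = mean(xdata))
}

# Option 1: the data are optional when the statistics are supplied.
foo.func <- function(xdata = NULL, suff.stats = NULL) {
  if (is.null(suff.stats)) {
    if (is.null(xdata)) stop("supply either 'xdata' or 'suff.stats'")
    suff.stats <- compute.suff.stats(xdata)
  }
  # Placeholder computation that uses only the sufficient statistic.
  suff.stats$mean^2
}

# A second hypothetical function that needs the same statistic.
bar.func <- function(xdata = NULL, suff.stats = NULL) {
  if (is.null(suff.stats)) {
    if (is.null(xdata)) stop("supply either 'xdata' or 'suff.stats'")
    suff.stats <- compute.suff.stats(xdata)
  }
  exp(suff.stats$mean)
}

x <- c(1, 2, 3, 4)

# Beginner usage: pass the data and let the function do the work.
a1 <- foo.func(x)

# Expert usage: compute the statistics once and reuse them.
ss <- compute.suff.stats(x)
a2 <- foo.func(suff.stats = ss)
b2 <- bar.func(suff.stats = ss)
```

With this layout the expert has to name the suff.stats argument explicitly, and the function stops with an error if neither input is given.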