How to include a closure in an R-package?

Question

I would like to include a closure with the functions of an R package we are writing. The function (and its siblings) will have data in its environment, perform a comparison of input with the data, and return the result. To illustrate, think of a function with an inbuilt telephone directory: you query with a number and the function returns a name.

This function will be called as a helper by several other functions in our R package, so it has to exist once the package is loaded. And we want the function to be available in the package environment, just like any other function.

Should I create it via its factory function in .onLoad() and assign() it to the package environment? Could I ship it as an .RDS or .RData file, or does this violate CRAN policy on "binary executable code"? Or is there a different, canonical way? And where would the code and the data (or the RDS/RData file) go in the package directory structure?

(I see that the question of how to document a closure has been discussed here).

@alistaire - this closure **is not expected to be changed**. Thus creating it with `.onLoad()` doesn't seem wrong - no? — hyginn, Mar 25 '17 at 23:55
@alistaire - "_R still won't let you assign it to the package namespace_" I might misunderstand the ns-hooks documentation - but I thought it means this is what `.onLoad()`, `.onAttach()` are for: `.onLoad()` to take care of things that need to happen before the namespace is sealed, `.onAttach()` for things that need to happen before the environment is sealed. — hyginn, Mar 26 '17 at 03:36
The question really covers only a special case of the question title: how to include a closure in an R-package, *if and only if* the factory is called while loading. Does anyone have a solution if the factory is called by the user _after the package is loaded_, and produces/modifies a function _inside_ the package? — Ma Ba, Feb 14 '18 at 15:57
I have just rejected @Ma Ba's thoughtful edit of my self-response. I think their approach is useful, but I really cannot claim credit for it in my response (and it is not directly related to the original question as stated). So: please add your suggestion as your own response. — hyginn, Feb 23 '18 at 16:31
thanks @hyginn; agreed. I added it as an answer as this is the first thing popping up when searching for the problem, and couldn't find a solution anywhere else — Ma Ba, Feb 24 '18 at 18:37

hyginn · Accepted Answer · 2018-02-23T16:28:41.557

For the benefit of anyone stumbling on this question: the solution I finally worked out involves a few steps but is "clean" as far as I can tell.

Put the factory function in a file R/aaa.R to ensure it gets loaded before the file that defines the closure.
Put the data that the closure uses into the standard inst/extdata/ folder.
Put a file with the closure's name and a proper docstring into R/. In that file, first define the closure as a normal function that just returns nothing; this is necessary so that the function is properly exported and known in the package namespace. Then immediately call the factory function to create the closure and overwrite the stub definition.

Note: it's not enough to just bring the data into the factory function as an argument; the data actually needs to be accessed inside the factory before the closure is defined. Why? Because arguments are evaluated lazily, the data won't actually have been loaded into the environment you need it in unless you access it.

That's all. Summary: create a stub for your closure, then overwrite that with the return value of the factory function.
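The steps above can be sketched roughly as follows. This is a minimal illustration, not the original author's code: the names `makeLookup`, `lookupName`, `directory.rds` and `myPackage` are hypothetical, and a temporary file stands in for the file that would live in inst/extdata/ so that the snippet runs outside a package.

```r
# --- R/aaa.R: the factory (hypothetical name) ---
makeLookup <- function(directory) {
  # Access the argument now; otherwise it stays an unevaluated promise
  # and the data is not yet present in the closure's environment.
  force(directory)
  function(number) {
    unname(directory[as.character(number)])
  }
}

# Stand-in for inst/extdata/directory.rds. Inside a package one would
# locate the file with
#   system.file("extdata", "directory.rds", package = "myPackage")
path <- tempfile(fileext = ".rds")
saveRDS(c("555-0100" = "Ada", "555-0199" = "Grace"), path)

# --- R/lookupName.R: stub carrying the documentation, then the overwrite ---
#' Look up a name by telephone number
#' @param number A telephone number (character).
#' @return The matching name, or NA if the number is unknown.
#' @export
lookupName <- function(number) NULL    # stub: only here to be documented/exported

lookupName <- makeLookup(readRDS(path))  # overwrite the stub with the closure

lookupName("555-0100")
```

Because the overwrite is ordinary top-level package code, it is evaluated when the package is built, so the installed function already carries its data in its enclosing environment.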

Ma Ba · Answer 2 · 2018-11-13T01:00:13.850

If the factory function is called later by the package user, but we still want the returned closure to live inside the package (for example, if we don't want it to be changed by anything other than the factory, and want it reliably accessible from within the package, documented, etc.):

# exported function (visible to the user)
# everything this function does is 'outsourced'
# to a non-exported function that we can overwrite with the factory:

visible_function <- function(...) {
  hidden_function(...)
}
# non-exported function (invisible to the user)
# called by the visible function
# fails unless the factory has been called first
hidden_function <- function(...) {
  stop("call factory_function() before you can use visible_function()")
}

# exported function, visible to the user.
# replaces the hidden function called by the visible function
factory_function <- function(x) {
  force(x)  # evaluate x now so the closure captures its value
  produced_function <- function(...) {
    print(paste(x, "is an object forever stored in my namespace!"))
  }
  utils::assignInNamespace("hidden_function",
                           produced_function,
                           ns = "myPackageName")
}

Note that R CMD check throws a NOTE on assignInNamespace(), so CRAN won't easily accept this solution.
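One commonly used way to sidestep that NOTE, offered here only as a sketch and not as part of the answer above, is to keep the produced closure in a private environment inside the package instead of rebinding a name in the namespace. The name `.state` and the slot `impl` are hypothetical.

```r
# Private environment, created once when the package code is evaluated.
# Its contents stay modifiable even after the namespace is sealed.
.state <- new.env(parent = emptyenv())

# Exported: delegates to whatever closure the factory has stored.
visible_function <- function(...) {
  impl <- .state$impl
  if (is.null(impl)) {
    stop("call factory_function() before you can use visible_function()")
  }
  impl(...)
}

# Exported: builds the closure and stores it in the private environment.
factory_function <- function(x) {
  force(x)  # evaluate x now so the closure captures its value
  produced_function <- function(...) {
    paste(x, "is stored in the package's private environment")
  }
  assign("impl", produced_function, envir = .state)
  invisible(NULL)
}

factory_function("My directory")
visible_function()
```

The trade-off is that the closure is no longer a named function in the namespace; it is reached only through `visible_function()`, which is what carries the documentation.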
