How can I limit the scope of functions using source()?

Question

I come from Python, where import behaves in a more namespaced style, and I have little background in R.

I am trying to develop an R application that is split into separate entities, but as far as I understand, R has no import as in Python. From what I gathered:

library is for importing installed libraries, which have their own namespaces, so the risk of conflict in the importing .R file can be mitigated with include.only.
if you have code that is part of your application and not in an external library, you have to use source. From what I gathered, source is basically equivalent to pasting the whole content of the sourced.R file into the sourcing.R file.

I am discussing the second case here, not the first. I suspect that what happens in this case is that if you have multiple sourced.R files that define the same symbols, they will conflict silently. From the Python point of view, it is pretty much like import *.
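A quick way to check that suspicion is a minimal sketch along these lines. The two temporary files and the name `helper` are made up for illustration; the only assumption is that `source()` is called with its default `local = FALSE`.

```r
# Two hypothetical module files that both define `helper`.
mod_a <- tempfile(fileext = ".R")
mod_b <- tempfile(fileext = ".R")
writeLines('helper <- function() "from a"', mod_a)
writeLines('helper <- function() "from b"', mod_b)

source(mod_a)  # defines helper in the global environment
source(mod_b)  # replaces it, with no warning or error

result <- helper()
print(result)  # whichever file was sourced last wins
```

Swapping the two `source()` calls changes which definition survives, so the order of sourcing matters.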

Here are the questions:

am I correct in saying that the functions defined in sourced.R will all go in the global environment, not in their own environment?
what happens if you source something twice? does it get included twice?
is there a technical or best practice solution to prevent accidental conflicts from sourced modules that happen to have the same symbol names?

Edit

This is an example following G. Grothendieck's suggestion:

ex1.R

cat("hello")
source("whatever.R")
source("whatever.R", local=whatever <- new.env())
x()
whatever$x()

whatever.R

cat("whatever")
print(environment())
x <- function() {
    print("x")
}

So, in principle, one could use this strategy to ensure that functions are not all shoved into the global namespace, where they can conflict. However, this becomes the responsibility of the importing code. Additionally, if any state is maintained and two pieces of code source the same module, they will end up with different environments and thus different state.

The bottom line is that environments must be stateless.
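One possible way around the duplicated-state problem is to cache the environment per file, so every caller gets the same one. The following is only a sketch: `use_module` and `.module_cache` are hypothetical names, not part of base R or of any package, and the counter module is a throwaway example.

```r
# Hypothetical helper: source each file at most once and hand every caller
# the same environment, so module state is shared rather than duplicated.
.module_cache <- new.env()

use_module <- function(path) {
  key <- normalizePath(path)
  if (!exists(key, envir = .module_cache, inherits = FALSE)) {
    env <- new.env(parent = globalenv())
    # Register before sourcing, so a circular use_module() call finds the
    # (partially filled) entry instead of sourcing the file again.
    assign(key, env, envir = .module_cache)
    source(key, local = env)
  }
  get(key, envir = .module_cache, inherits = FALSE)
}

# Demo with a throwaway module that keeps a counter as state.
counter_file <- tempfile(fileext = ".R")
writeLines(c(
  "count <- 0",
  "bump <- function() { count <<- count + 1; count }"
), counter_file)

m1 <- use_module(counter_file)
m2 <- use_module(counter_file)  # served from the cache, not re-sourced
m1$bump()
m2$bump()
print(identical(m1, m2))  # both handles point at the same environment
print(m1$count)           # both bump() calls updated the same counter
```

The importing code still has to call the helper itself, so this only addresses the shared-state concern, not the point about responsibility.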

You can source into an environment. See the `local` argument of `source`. Also see the modules package on CRAN. — G. Grothendieck, Aug 08 '19 at 12:59
@G.Grothendieck it's not really clear to me what is meant with local in that context. Does the sourced.R file have its own separate environment in that case? — Stefano Borini, Aug 08 '19 at 13:17
Yes. Try `cat("a <- 3", file = "testa.R"); source("testa.R", local = e <- new.env()); ls(e)` — G. Grothendieck, Aug 08 '19 at 13:42
@G.Grothendieck After experimenting, it won't bind it to a different environment. local is only valid if you are sourcing from e.g. inside a function. If you don't include local, source will bind the symbols to the global env in any case. Only if you include local=TRUE the symbols will be bound to the function's environment — Stefano Borini, Aug 08 '19 at 13:42
@G.Grothendieck ah, that's interesting. it's not only a logical value? — Stefano Borini, Aug 08 '19 at 13:43
@G.Grothendieck you must agree it's not very well written that it's an option and what it does. — Stefano Borini, Aug 08 '19 at 13:51
The help files do tend to be terse and need to be read closely. — G. Grothendieck, Aug 08 '19 at 13:53

stevec · Answer 1 · 2019-08-08T13:41:04.230

0

To your questions:

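By default (local = FALSE), objects created in a sourced script end up in the global environment, wherever source() is called from. With local = TRUE they are created in the environment from which source() was called, and local can also be given an environment directly. So if the script was sourced from the global environment, the objects created in the sourced script will be available in the global environment.
If you source something twice, it gets included once, in the sense that the second run simply reassigns the same names, just as though you ran x <- 4; x <- 4. In that case x would only be included once. The script itself is still evaluated both times, so side effects such as print() or cat() happen twice.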
@G. Grothendieck suggests seeing the local argument. You may also find this answer useful (shows how global assignment works, if required). Hadley's write up on environments is comprehensive.
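The second point can be checked with a small sketch like the one below. The temporary script and the environment name `env` are made up for illustration. It captures the output of two `source()` calls and then lists the bindings they produced.

```r
# A throwaway script with one side effect and one assignment.
script <- tempfile(fileext = ".R")
writeLines(c(
  'cat("loading\\n")',
  "x <- 4"
), script)

env <- new.env()
out <- capture.output({
  source(script, local = env)
  source(script, local = env)
})

print(out)      # the cat() message shows up once per source() call
print(ls(env))  # yet there is only a single binding named x
```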

In case it's useful, we can create the following two files. Running just the code in first.R shows us:

a prints as 2 inside second.R, but because of local = new.env() the assignment happens in the new environment, and a is still 1 in the environment that called source(). Without the local argument, the assignment would overwrite a there (this applies to objects and functions alike)
c equals 8, which confirms that a sourced script 'sees' the object b from the env from which it was sourced

# first.R
# rm(list=ls())
a <- 1
b <- 4
source("second.R", local = new.env())

# second.R
a <- 2 
new_func <- function(x) { x * 2 } 
c <- new_func(b) 
print(a)
print(c)
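For anyone who wants to run this without creating files by hand, here is a self-contained variant of the same two files. It is a sketch under one change from the original: the new environment is kept in a variable (the name `mod` is arbitrary), so its contents can be inspected after sourcing.

```r
# Write a stand-in for second.R to a temporary file.
second <- tempfile(fileext = ".R")
writeLines(c(
  "a <- 2",
  "new_func <- function(x) { x * 2 }",
  "c <- new_func(b)"
), second)

a <- 1
b <- 4
mod <- new.env()  # its parent is the calling environment, so b is visible
source(second, local = mod)

print(a)      # the caller's a, untouched by the sourced script
print(mod$a)  # the a assigned inside the sourced script
print(mod$c)  # computed from the caller's b
```

Keeping the handle also means functions defined in the script can be called as `mod$new_func(...)`, much like `whatever$x()` in the question's edit.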

edited Aug 08 '19 at 13:41

answered Aug 08 '19 at 13:26

stevec


> "If you source something twice, it gets included once". Not from my brief evaluation. If you have a cat() or print(), you get two messages, which means that every time you need to use something that lives in another .R file, you have either to rely that the given code has already been sourced and therefore lives in the global namespace (BAD!) or source it everywhere and have a double inclusion guard like in C. – Stefano Borini Aug 08 '19 at 13:30
@StefanoBorini I agree with `print()` or `cat()`. But I believe any latter assignments (functions or other objects) will simply override existing objects of the same name? – stevec Aug 08 '19 at 13:33
Yes but my point is that if it behaves like that, you might end up with the following problems: 1. code that has a cyclic dependency will import itself forever until it runs out of memory 2. side effects are produced multiple times 3. if two modules happen to have the same named routine now order of import matters, which is a nightmare in a large application – Stefano Borini Aug 08 '19 at 13:40
I think I found a simple solution. Please check the updated example above. `source("second.R", local = new.env())` creates a new environment to run the sourced code. Now we see `a` equals `2` in the environment in which `second.R` runs (since `print()` prints this for us), but we also see that `a` is unchanged in the parent environment (i.e. it still equals 1) – stevec Aug 08 '19 at 13:41
