I come from Python, where import behaves in a more namespaced style, and I have little background in R.

I am trying to develop an R application that is split into separate entities, but as far as I understand, R has no import as in Python. From what I gathered:

  • library imports installed packages, which have their own namespaces, so the risk of conflicts in the importing .R file can be mitigated with include.only.
  • if you have code that is part of your application and not in an external library, you have to use source. From what I gathered, source is basically equivalent to pasting the whole content of the sourced.R file into the sourcing.R file.

I am discussing the second case here, not the first. I suspect that in this case, if multiple sourced .R files define the same symbols, they will conflict silently. From the Python point of view, it is pretty much like import *.
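
To make the suspected silent conflict concrete, here is a minimal sketch. The two module files, the function name greet and the environment app are all hypothetical; app merely stands in for the global environment of the importing script so the example does not touch the real one.

```r
# Two hypothetical "modules", written to temporary files so the sketch is self-contained.
mod_a <- tempfile(fileext = ".R")
mod_b <- tempfile(fileext = ".R")
writeLines('greet <- function() "hello from a"', mod_a)
writeLines('greet <- function() "hello from b"', mod_b)

# Stand-in for the importing script's environment (normally the global environment).
app <- new.env()

# After the first source(), greet refers to the definition in mod_a.
source(mod_a, local = app)
first_result <- app$greet()

# Sourcing the second file rebinds greet; R emits no warning or error.
source(mod_b, local = app)
second_result <- app$greet()

print(first_result)
print(second_result)
```

Only one binding named greet exists at the end, and which definition wins depends purely on the order of the source() calls.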

Here are the questions:

  1. Am I correct in saying that the functions defined in sourced.R will all go into the global environment, not into their own environment?
  2. What happens if you source something twice? Does it get included twice?
  3. Is there a technical or best-practice solution to prevent accidental conflicts between sourced modules that happen to use the same symbol names?

Edit

This is an example following G. Grothendieck's suggestion:

ex1.R

cat("hello")
source("whatever.R")
source("whatever.R", local=whatever <- new.env())
x()
whatever$x()

whatever.R

cat("whatever")
print(environment())
x <- function() {
    print("x")
}

So, in principle, one could use this strategy to ensure that functions are not all shoved into the global namespace, where they can conflict. However, it becomes the responsibility of the importing code. Additionally, if any state is maintained and two pieces of code source the same module, they will end up with different environments and thus different state.

The bottom line is that modules sourced this way must be stateless.
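
The state problem can be reproduced with a small sketch. The counter module, its increment function and the two importer environments are hypothetical names chosen for illustration; the module is written to a temporary file so the example is self-contained.

```r
# Hypothetical stateful module: a counter kept in the module's own environment.
counter_file <- tempfile(fileext = ".R")
writeLines(c(
  "count <- 0",
  "increment <- function() {",
  "  count <<- count + 1",
  "  count",
  "}"
), counter_file)

# Two importers each source the same file into their own fresh environment.
importer_one <- new.env()
importer_two <- new.env()
source(counter_file, local = importer_one)
source(counter_file, local = importer_two)

# Each increment() updates the count in the environment it was defined in.
invisible(importer_one$increment())
invisible(importer_one$increment())
invisible(importer_two$increment())

# The two copies of the module have drifted apart.
print(importer_one$count)
print(importer_two$count)
```

Each source() call evaluates the file again, so every importer gets an independent copy of count rather than a shared one.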

Stefano Borini

1 Answer

To your questions:

  1. Objects created in a sourced script will be available in the environment from which the script was sourced. So if the script was sourced from the global environment, the objects created in the sourced script will be available in the global environment.
  2. If you source something twice, it gets included once, in the sense that the second run simply overwrites the objects created by the first, just as though you ran x <- 4; x <- 4. In that case x would only exist once. The script itself is executed each time, though, so side effects such as cat() or print() happen twice.
  3. @G. Grothendieck suggests looking at the local argument of source(). You may also find this answer useful (it shows how global assignment works, if required). Hadley's write-up on environments is comprehensive.
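
As a rough check of point 2, the sketch below sources a small script twice and captures what it prints. It then adds a hypothetical source_once() helper, which is not part of base R, as one possible inclusion guard; the names script, target and loaded are likewise made up for the example.

```r
# Hypothetical script with a side effect (cat) and an assignment.
script <- tempfile(fileext = ".R")
writeLines(c('cat("loading\\n")', "x <- 4"), script)

# Stand-in for the environment the script is sourced into.
target <- new.env()

# capture.output() collects what the script prints on each run.
printed <- capture.output({
  source(script, local = target)
  source(script, local = target)
})
print(printed)     # one entry per run of the script
print(ls(target))  # x is bound only once, whatever the number of runs

# Hypothetical inclusion guard: remember which paths were already sourced.
loaded <- new.env()
source_once <- function(path, envir) {
  key <- normalizePath(path)
  if (!exists(key, envir = loaded, inherits = FALSE)) {
    # Mark the file before running it, so a cyclic source_once() call stops here.
    assign(key, TRUE, envir = loaded)
    source(path, local = envir)
  }
  invisible(envir)
}

printed_once <- capture.output({
  source_once(script, target)
  source_once(script, target)
})
print(printed_once)
```

The guard only helps if every importer goes through the same helper and the same loaded registry, which is an assumption of this sketch rather than something source() provides.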

In case it's useful, we can create the following 2 files and run just the code in first.R. With a plain source("second.R") we would see that

  • a becomes 2, which shows that assignment in a sourced script will indeed override any name in the env from which it was sourced (this applies to objects/functions alike)
  • c equals 8, which confirms that a sourced script 'sees' the object b from the env from which it was sourced

With local = new.env(), as in the version shown here, second.R still prints 2 and 8, but a remains 1 in the calling environment.
# first.R
# rm(list=ls())
a <- 1
b <- 4
source("second.R", local = new.env())

# second.R
a <- 2
new_func <- function(x) { x * 2 }
c <- new_func(b)
print(a)
print(c)
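
For anyone who wants to try both variants without creating files by hand, here is a self-contained sketch. It recreates second.R (minus the print calls) in a temporary file and uses explicit environments as stand-ins for the global environment; caller1, caller2 and sandbox are hypothetical names.

```r
# Recreate second.R from above in a temporary file.
second <- tempfile(fileext = ".R")
writeLines(c(
  "a <- 2",
  "new_func <- function(x) { x * 2 }",
  "c <- new_func(b)"
), second)

# Variant 1: source directly into the calling environment.
caller1 <- new.env()
assign("a", 1, envir = caller1)
assign("b", 4, envir = caller1)
source(second, local = caller1)
print(caller1$a)  # a was overwritten by the sourced script
print(caller1$c)

# Variant 2: source into a fresh environment whose parent is the caller.
caller2 <- new.env()
assign("a", 1, envir = caller2)
assign("b", 4, envir = caller2)
sandbox <- new.env(parent = caller2)
source(second, local = sandbox)
print(caller2$a)  # the caller's a is untouched
print(sandbox$a)  # the script's a lives in the sandbox
print(sandbox$c)  # b was still found through the parent environment
```

In the second variant new_func and c exist only in sandbox, so the caller has to reach them through that environment.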
stevec
  • > "If you source something twice, it gets included once". Not from my brief evaluation. If you have a cat() or print(), you get two messages, which means that every time you need something that lives in another .R file, you either have to rely on that code having already been sourced and therefore living in the global namespace (BAD!), or source it everywhere and use a double inclusion guard like in C. – Stefano Borini Aug 08 '19 at 13:30
  • @StefanoBorini I agree with `print()` or `cat()`. But I believe any later assignments (functions or other objects) will simply override existing objects of the same name? – stevec Aug 08 '19 at 13:33
  • Yes, but my point is that if it behaves like that, you might end up with the following problems: 1. code that has a cyclic dependency will import itself forever until it runs out of memory; 2. side effects are produced multiple times; 3. if two modules happen to have a routine with the same name, the order of import now matters, which is a nightmare in a large application. – Stefano Borini Aug 08 '19 at 13:40
  • I think I found a simple solution. Please check the updated example above. `source("second.R", local = new.env())` creates a new environment in which to run the sourced code. Now we see that `a` equals `2` in the environment in which `second.R` runs (since `print()` prints this for us), but we also see that `a` is unchanged in the parent environment (i.e. it still equals 1) – stevec Aug 08 '19 at 13:41