I have a standard data analysis procedure that I need to run on various datasets (~50). I have been developing it for some time, and I have now reached the point where I would like to turn it into a function that takes a dataset and spits out a sensible table for each one. However, the procedure currently spans four script files, and so far I have used source from one to another to run it, which seems to be impossible inside a function.

I have the following problem:

foo <- function(data) {
  a <- somevariable
  source("..somefile..") #The code in there uses a, but a is not in the workspace...
  ..
  continue
  ..
}

The code crashes when you run it on a dataset.

Is there some command that would just copy-paste the commands from the other files into the function while "compiling" it (I don't know what else to call it, even though it is not real compiling)? I know I can copy-paste the code myself, but I would rather not, because various steps include neural networks and ARFIMA estimations, which I would like to keep in separate files for the sake of readability. In any case, after copy-pasting, the function would be something like 200 lines of code, which is definitely not user-friendly...

Thx

Bryan Hanson
krhlk
  • Why not make the files you currently `source` the arguments to the function? It does sound like you are close to the point where you need to start formalizing things as functions that do specific tasks, and perhaps build a small package out of it (or at least source all your functions and then use them in other functions). – Bryan Hanson Feb 20 '13 at 14:41
  • Nope, it is just one-purpose code, making package is not useful, I already made a pretty big package to do this analysis. ;) Does a function that only includes some text on compilation exist in R? – krhlk Feb 20 '13 at 14:47
  • I'm not completely sure what you mean by 'only includes some text on compilation'. There are certainly functions that include blocks of text to pass to another function, or that write blocks of text to files, like html files. – Bryan Hanson Feb 20 '13 at 14:52
  • Ah, you have a scoping problem more than anything. The `..some file..` needs to be written as a function so that arguments can be substituted and don't have to have identical names in all contexts. Clearly the environment of the sourced function is not the same as the calling function, or you wouldn't get the error that you do. Fixing the environment issue is going to be more work and more error prone than just biting the bullet and re-writing the source scripts as functions. – Bryan Hanson Feb 20 '13 at 15:01
  • I guess another option would be to put it all in a `Rnw` file and use `sweave` or `knitr` to do your analysis (no sourcing, just include the full code within the code chunks). This would have the advantage of being able to write out nice reports and graphs if that is important. – Bryan Hanson Feb 20 '13 at 15:09
  • So in other words such a function does not exist? – krhlk Feb 20 '13 at 15:45
  • If I understand what you want to happen, I don't think such a function exists or even could exist (how could a generic function read a random piece of code and get it right w/o some kind of table that maps the names of things in each environment? Which, by the way, is the exact purpose of using functions within functions). You might be interested in this article on the design of the R language. Every time I re-read it (the first part, the 2nd is too hard for me!) I gain a new appreciation. See r.cs.purdue.edu/pub/ecoop12.pdf – Bryan Hanson Feb 20 '13 at 16:42
  • I'd suggest wrapping the code in "somefile" in a function with the data that would be in "a" as an argument. – seandavi Feb 20 '13 at 17:02

3 Answers

1

I'd suggest starting with a minimal example so you get a sense of what writing a function entails, how to load the function into R using source(), how to use arguments, and how to call the function. After doing that, it will hopefully be more evident where to go next.

To answer your question, if your script includes 200 lines of code and does only one thing (that is, it does one FUNCTION), you should be thinking about wrapping that into a function, yes. This will actually increase user friendliness rather than decrease it since your scripts can now include only one line (the function call) rather than the original 200 lines of code.
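As a minimal sketch of that idea (the names `fit_step` and `analyse` are hypothetical, and the body of `fit_step` is only a placeholder for a real estimation step), the value that a script used to read from the workspace becomes an explicit argument:

```r
# Hypothetical stand-in for the contents of one script file, rewritten as a
# function: the value the script used to read from the workspace ("a")
# becomes an explicit argument.
fit_step <- function(a) {
  mean(a) + 1  # placeholder for the real estimation
}

# The top-level function passes its local values along explicitly.
analyse <- function(data) {
  a <- data$value
  fit_step(a)
}

result <- analyse(data.frame(value = c(1, 2, 3)))
```

In practice `fit_step` could stay in its own file and be loaded once with source() before `analyse` is called, so the code remains split across files.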

seandavi
  • Yes, I could rewrite it as a function, but it would take a lot of time and I was wondering whether a command that only includes texts exists? – krhlk Feb 20 '13 at 14:49
  • I don't think it will take a lot of time once you know how and that is a skill you WANT to learn :), but source() is what you would use to run code from another file inside your function if you think that is somehow easier. – seandavi Feb 20 '13 at 16:15
0

I'm guessing here, but something like this:

myfunc <- function() {
  # `something` and `another_thing` are placeholders for your own conditions
  if (something) {
    source("path to script")
  } else if (another_thing) {
    source("path to another script")
  }
  # Do calcs and return the result
}

Or the paths to the scripts could be function arguments.
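A rough sketch of the "paths as arguments" variant, assuming the scripts are meant to see the function's local variables (hence `local = TRUE`, which is an addition to the code above). The helper `run_steps`, the two throwaway scripts and the convention that the last script assigns `result` are all hypothetical:

```r
# Hypothetical helper: runs each script in turn inside the function's own
# environment, so the scripts can see `data` and any objects created by
# earlier scripts.
run_steps <- function(data, script_paths) {
  for (path in script_paths) {
    source(path, local = TRUE)
  }
  result  # assumes the last script assigns an object called `result`
}

# Two throwaway scripts written to temporary files for the demonstration.
step1 <- tempfile(fileext = ".R")
step2 <- tempfile(fileext = ".R")
writeLines("total <- sum(data$value)", step1)
writeLines("result <- total / nrow(data)", step2)

out <- run_steps(data.frame(value = c(2, 4, 6)), c(step1, step2))
```

The scripts stay in separate files, but they depend on agreed variable names (`data`, `total`, `result`), which is the coupling that rewriting them as functions would remove.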

Bryan Hanson
0

If I understand your question correctly, you want to run a script inside a function using source. All you need to do is add the parameter local = TRUE.

Example

# This is your source code
y <- x + 1
# Save it in your favorite path as "your_path.R"

# Function that calls the above code
your_function <- function(x) {
  source("your_path.R", local = TRUE)  # adds 1 to x. IMPORTANT: local = TRUE
  return(y)
}
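
To see why the argument matters, here is a small self-contained comparison. It is a sketch only: the script is written to a temporary file, and the names `with_local`, `without_local` and `input_value` are made up for the illustration. By default source evaluates the file in the global environment, where the function's argument is not visible; this assumes no object called `input_value` exists in your workspace.

```r
# Throwaway script that relies on a variable defined by its caller.
script <- tempfile(fileext = ".R")
writeLines("y <- input_value + 1", script)

with_local <- function(input_value) {
  source(script, local = TRUE)  # evaluated in this function's environment
  y
}

without_local <- function(input_value) {
  # Default local = FALSE: the script is evaluated in the global environment,
  # where input_value is not defined, so the call is expected to fail.
  tryCatch({
    source(script)
    y
  }, error = function(e) "failed")
}

ok <- with_local(1)
bad <- without_local(1)
```

Here `ok` holds the value computed by the script, while `bad` holds the string "failed" because the script could not find `input_value`.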
Community
Henry Navarro