I want to execute a script file.R using Rscript. In file.R, I use the package dplyr.

# file.R
df <- data.frame(ID,x,y,z,...)
library(dplyr)
filter(df, ID != "null")
......

If I don't specify any options in the batch file, everything works fine, since file.R includes the line library(dplyr).

# 1) no specification of packages in the batch file  
Rscript.exe file.R arg1 arg2 arg3 > outputFile.Rout 2>&1

However, if I add --default-packages=utils in the batch file,

# 2) specification of packages utils in the batch file
Rscript.exe --default-packages=utils file.R arg1 arg2 arg3 > outputFile.Rout 2>&1

the part of file.R using dplyr doesn't work anymore (Error in filter(df, ID != 'null') : Object 'ID' could not be found).

Since ?Rscript says

--default-packages=list
where list is a comma-separated list of package names or NULL

I tried adding --default-packages=utils,dplyr,

# 3) specification of packages utils and dplyr in the batch file
Rscript.exe --default-packages=utils,dplyr file.R arg1 arg2 arg3 > outputFile.Rout 2>&1

which causes the same error as in 2).

Why is batch file 1 the only one that works? I am calling the same R script in all 3 alternatives.

rmuc8
  • Is there a reason you feel the need to specify the packages at the command-line as opposed to just putting the respective `library` at the start of your script? – cdeterman Apr 14 '15 at 12:13
  • The thing is, that putting `library` at the start of the script is not the solution, and that's the reason for my posting. In all 3 alternatives, I called the same script `file.R`, which included the library commands at the start. – rmuc8 Apr 14 '15 at 12:25
  • I thought you said that option 1 works where you don't specify any options? Am I misunderstanding your statement? Your problem is not quite clear. – cdeterman Apr 14 '15 at 12:28
  • Is it clearer now after the edit? If not, could you specify what is unclear? Thx for your feedback anyway. – rmuc8 Apr 14 '15 at 12:35
  • @Dason No, I've tried that before. I also read that there was an issue with chaining operators and Rscript in the past, but even the removal of chaining operators does not make it work. – rmuc8 Apr 14 '15 at 12:43
  • You could always try `r` from [littler](http://dirk.eddelbuettel.com/code/littler.html). I use it all the time with the `-lpkg1,pkg2,pkg3` switch to load several required packages. It also loads `methods` by default -- and still starts faster than `Rscript`. – Dirk Eddelbuettel Apr 14 '15 at 12:51
  • I'll have a look at it, thx Dirk! – rmuc8 Apr 14 '15 at 13:01
  • you say "The thing is, that putting library at the start of the script is not the solution". Why isn't it? Is it because the package does not exist on .libPaths() from within RScript? – mpag Dec 21 '18 at 22:20

3 Answers

The --default-packages parameter specifies the packages you want to load by default. It doesn't add to the list of default packages; it replaces the list. That means you need to specify all the other base packages you are relying on as well. You can see this by making a simple test script that calls sessionInfo().

In file "env.R":

sessionInfo()

Call from the terminal: Rscript env.R

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base 

Now I modify that call: Rscript --default-packages=utils env.R

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] utils base 

So you do need to specify the other missing packages.

Rscript --default-packages=stats,graphics,grDevices,utils,datasets,base,methods env.R

and I threw methods in there too.
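
If the goal is "the usual defaults plus a few extras", the full list has to be assembled by hand. Here is a minimal sketch for a POSIX shell, assuming the default set is the one from the sessionInfo() output above plus methods. The variable names are only illustrative, and file.R with its arguments is taken from the question:

```shell
# Base packages from the sessionInfo() output above, plus methods.
# (base itself is always attached, so it is left out here.)
defaults="stats,graphics,grDevices,utils,datasets,methods"

# Extra packages the script needs (illustrative).
extra="dplyr"

# --default-packages replaces the list, so pass defaults and extras together.
pkgs="${defaults},${extra}"

# Print the command first to check it; drop the echo to actually run it.
echo Rscript --default-packages="${pkgs}" file.R arg1 arg2 arg3
```

In a Windows batch file the same idea applies: write the complete comma-separated list after --default-packages= instead of only the packages you want to add.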

With that said, if you weren't having any issues when you just ran it with Rscript, I don't understand why you're trying to change the default-packages argument. It seems like you're just creating problems for yourself, unless there are other issues you are trying to solve that you haven't told us about.

Dason
  • _it just creates problems_ because, as @Dason says, you are using it in the wrong way. "Default packages" is not the same as "these default packages PLUS a few I need". – Dirk Eddelbuettel Apr 14 '15 at 14:05

For completeness and to illustrate my comment, Charles's example works for me on one line using littler. I am line-breaking it here just for the exposition:

edd@max:~$ r -ldplyr -e'iris %>% \
                        group_by(Species) %>% \
                        summarise(mean(Sepal.Length)) %>% \
                        print'
Source: local data frame [3 x 2]

     Species mean(Sepal.Length)
1     setosa              5.006
2 versicolor              5.936
3  virginica              6.588
edd@max:~$ 

As I said, that really is one line (and one difference is that r wants to explicitly print).

But as you can see, the datasets package is also automagically loaded by r.

Dirk Eddelbuettel

Can you try the following test? I can't fit this in the comments. This runs fine on my system.

test.R

library(dplyr)

data(iris)

iris %>%
  group_by(Species) %>%
  summarise(mean(Sepal.Length))

In your terminal:

Rscript --default-packages=utils,datasets,dplyr test.R

cdeterman