I am using the R segue package to carry out parallel processing. I would like a source package to be installed on the cluster instances when the cluster is set up. The package is one I wrote myself. I have built it into a tar.gz file, but I cannot work out how to get it installed on the cluster instances.

To make a reproducible example with an existing package, I downloaded the devtools package from CRAN as a tar.gz file and tried passing it via the sourcePackagesToInstall parameter.

Here is my example. Should I be doing something different?

require(segue)
myCluster <- createCluster(5, sourcePackagesToInstall = c('/path.to.downloads/Downloads/devtools_0.8.tar.gz'))

This results in the following output:

[1] "INFO: Now building sources packages to install and uploading them based on the sourcePackagesToInstall list."
[1] "INFO: Source packages uploaded."
STARTING - 2012-11-16 18:24:28
STARTING - 2012-11-16 18:25:00
STARTING - 2012-11-16 18:25:32
STARTING - 2012-11-16 18:26:03
STARTING - 2012-11-16 18:26:35
STARTING - 2012-11-16 18:27:07
STARTING - 2012-11-16 18:27:38
STARTING - 2012-11-16 18:28:10
STARTING - 2012-11-16 18:28:42
SHUTTING_DOWN - 2012-11-16 18:29:14
SHUTTING_DOWN - 2012-11-16 18:29:46
SHUTTING_DOWN - 2012-11-16 18:30:17
SHUTTING_DOWN - 2012-11-16 18:30:50
SHUTTING_DOWN - 2012-11-16 18:31:22
SHUTTING_DOWN - 2012-11-16 18:31:53
SHUTTING_DOWN - 2012-11-16 18:32:25
SHUTTING_DOWN - 2012-11-16 18:32:57
SHUTTING_DOWN - 2012-11-16 18:33:29
SHUTTING_DOWN - 2012-11-16 18:34:01
SHUTTING_DOWN - 2012-11-16 18:34:32
SHUTTING_DOWN - 2012-11-16 18:35:04
SHUTTING_DOWN - 2012-11-16 18:35:36
SHUTTING_DOWN - 2012-11-16 18:36:08
SHUTTING_DOWN - 2012-11-16 18:36:39
SHUTTING_DOWN - 2012-11-16 18:37:11
SHUTTING_DOWN - 2012-11-16 18:37:43
SHUTTING_DOWN - 2012-11-16 18:38:14
SHUTTING_DOWN - 2012-11-16 18:38:47
SHUTTING_DOWN - 2012-11-16 18:39:18
FAILED - 2012-11-16 18:39:50

Thanks

EDIT

I then tried starting the cluster from an EC2 instance. This is what I did. I know that devtools is on CRAN, but the aim is to get a custom package installed on each of the instances created by the cluster, and so far it has not worked. Sorry if this is long, but I thought it best to be thorough.

R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> system("wget -q  http://cran.r-project.org/src/contrib/devtools_0.8.tar.gz")
> system("R CMD INSTALL devtools_0.8.tar.gz")
* installing to library ‘/home/ubuntu/R/library’
* installing *source* package ‘devtools’ ...
** package ‘devtools’ successfully unpacked and MD5 sums checked
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG      -fpic  -O3 -pipe  -g  -c devtools.c -o devtools.o
gcc -std=gnu99 -shared -o devtools.so devtools.o -L/usr/lib/R/lib -lR
installing to /home/ubuntu/R/library/devtools/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded

* DONE (devtools)
> require(devtools)
Loading required package: devtools
> require(segue)
Loading required package: segue
Loading required package: rJava
Loading required package: caTools
Loading required package: bitops
Segue did not find your AWS credentials. Please run the setCredentials() function.
> setCredentials("xxxxxxxxxxxxxxx", "xxxxxxxxxxxxxxxxxx")
> getwd()
[1] "/home/ubuntu"
> cl <- createCluster(2, sourcePackagesToInstall=c("/home/ubuntu/devtools_0.8.tar.gz"))
[1] "INFO: Now building sources packages to install and uploading them based on the sourcePackagesToInstall list."
[1] "INFO: Source packages uploaded."
STARTING - 2012-11-22 03:58:07
STARTING - 2012-11-22 03:58:40
STARTING - 2012-11-22 03:59:11
STARTING - 2012-11-22 03:59:43
STARTING - 2012-11-22 04:00:15
STARTING - 2012-11-22 04:00:47
BOOTSTRAPPING - 2012-11-22 04:01:19
BOOTSTRAPPING - 2012-11-22 04:01:51
BOOTSTRAPPING - 2012-11-22 04:02:23
BOOTSTRAPPING - 2012-11-22 04:02:55
BOOTSTRAPPING - 2012-11-22 04:03:26
BOOTSTRAPPING - 2012-11-22 04:03:59
BOOTSTRAPPING - 2012-11-22 04:04:30
BOOTSTRAPPING - 2012-11-22 04:05:03
BOOTSTRAPPING - 2012-11-22 04:05:34
SHUTTING_DOWN - 2012-11-22 04:06:06
SHUTTING_DOWN - 2012-11-22 04:06:38
SHUTTING_DOWN - 2012-11-22 04:07:10
SHUTTING_DOWN - 2012-11-22 04:07:41
SHUTTING_DOWN - 2012-11-22 04:08:14
SHUTTING_DOWN - 2012-11-22 04:08:45
SHUTTING_DOWN - 2012-11-22 04:09:17
SHUTTING_DOWN - 2012-11-22 04:09:49
SHUTTING_DOWN - 2012-11-22 04:10:21
SHUTTING_DOWN - 2012-11-22 04:10:53
SHUTTING_DOWN - 2012-11-22 04:11:25
SHUTTING_DOWN - 2012-11-22 04:11:56
SHUTTING_DOWN - 2012-11-22 04:12:28
SHUTTING_DOWN - 2012-11-22 04:13:00
SHUTTING_DOWN - 2012-11-22 04:13:32
SHUTTING_DOWN - 2012-11-22 04:14:04
SHUTTING_DOWN - 2012-11-22 04:14:36
SHUTTING_DOWN - 2012-11-22 04:15:07
SHUTTING_DOWN - 2012-11-22 04:15:39
SHUTTING_DOWN - 2012-11-22 04:16:11
FAILED - 2012-11-22 04:16:43
> traceback()
No traceback available 
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C            LC_COLLATE=C         LC_MONETARY=C       
 [6] LC_MESSAGES=C        LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C         LC_TELEPHONE=C      
[11] LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] segue_0.05   caTools_1.13 bitops_1.0-5 rJava_0.9-3  devtools_0.8

loaded via a namespace (and not attached):
 [1] RCurl_1.95-3    digest_0.5.2    evaluate_0.4.2  httr_0.2        memoise_0.1     parallel_2.15.1 plyr_1.7.1     
 [8] stringr_0.6.1   tools_2.15.1    whisker_0.1    
> 

Any help would be greatly appreciated.

h.l.m

1 Answer

h.l.m, do you know for sure whether the package will load on a Linux machine? Your first debugging step should be to try loading the package from source on a Linux EC2 machine. If you can't load the package on Linux, it can never be loaded on the slave nodes using Segue.

Give that a test and let us know the results.
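
One way to run that check, as a minimal sketch: install the tarball into a throwaway library on the Linux machine and see whether the package shows up there. The helper names `package_name_from_tarball` and `check_source_install` are made up for illustration, and the path in the usage line is the one from the question. Note that with `repos = NULL` R does not download dependencies, so anything the package depends on must already be installed.

```r
# Hypothetical helper: derive the package name from a source tarball's file name,
# e.g. "devtools_0.8.tar.gz" -> "devtools".
package_name_from_tarball <- function(tarball) {
  sub("_.*$", "", basename(tarball))
}

# Hypothetical helper: install a source tarball into a temporary library and
# report whether the package ended up installed there.
check_source_install <- function(tarball) {
  stopifnot(file.exists(tarball))
  lib <- tempfile("testlib")
  dir.create(lib)
  # Remove the temporary library again when the function exits.
  on.exit(unlink(lib, recursive = TRUE), add = TRUE)
  # repos = NULL with type = "source" installs from the local file only;
  # dependencies are NOT downloaded.
  install.packages(tarball, lib = lib, repos = NULL, type = "source")
  package_name_from_tarball(tarball) %in% rownames(installed.packages(lib.loc = lib))
}

# Usage (path taken from the question; adjust to your own tarball):
# check_source_install("/home/ubuntu/devtools_0.8.tar.gz")
```

If this returns FALSE on a plain Linux instance, the install warnings printed by `install.packages` are the place to look before involving Segue at all.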

JD Long
  • Hi, I don't think it's a Linux OS issue. I just spun up a Linux instance and ran the following commands in an RStudio Server session to check the install: `system("wget http://cran.r-project.org/src/contrib/devtools_0.8.tar.gz")` `system("R CMD INSTALL devtools_0.8.tar.gz")` `require(devtools)`. That works fine. – h.l.m Nov 19 '12 at 20:51