I want to load libraries on the cluster nodes by their names, given as strings.

This code works fine:

library(snowfall)  # library() attaches only one package per call
sfInit(parallel = TRUE, cpus = 4, type = "SOCK")
sfClusterEval(library(e1071))

But this code produces an error: 4 nodes produced errors; first error: object 'expr' not found

library(snowfall)  # library() attaches only one package per call
sfInit(parallel = TRUE, cpus = 4, type = "SOCK")
lib <- "e1071"
expr <- parse(text=paste("library(", lib, ")", sep=""))
sfClusterEval(expr)

So sfClusterEval tries to evaluate the symbol expr rather than the expression that expr contains. I cannot understand what kind of expression should be passed to sfClusterEval, which uses substitute in its body:

> sfClusterEval
function (expr, stopOnError = TRUE) 
{
    sfCheck()
    if (sfParallel()) {
        return(sfClusterCall(eval, substitute(expr), env = globalenv(), 
            stopOnError = stopOnError))
    }
    else {
        return(eval(expr, envir = globalenv(), enclos = parent.frame()))
    }
}
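
For what it's worth, the effect of substitute can be reproduced without a cluster. The sketch below uses a hypothetical stand-in, capture_like_sfClusterEval, which mimics only the substitute(expr) step shown above and is not snowfall code. It suggests that the symbol expr itself is captured, and that splicing the parsed call into the argument slot with do.call is one possible way around it.

```r
# Hypothetical stand-in that mimics only the substitute(expr) step of
# sfClusterEval; it is not part of snowfall.
capture_like_sfClusterEval <- function(expr) substitute(expr)

lib <- "e1071"
expr <- parse(text = paste0("library(", lib, ")"))

# Passing the variable captures the symbol `expr`, not the call it holds.
captured <- capture_like_sfClusterEval(expr)
print(captured)

# do.call() places the call object itself into the argument slot, so
# substitute() then sees the library() call. Nothing is loaded here.
spliced <- do.call(capture_like_sfClusterEval, list(expr[[1]]))
print(spliced)

# Assumption (not run here): the same trick applied to the real function
# would be do.call(sfClusterEval, list(expr[[1]])).
```

Whether the do.call variant behaves the same way on a real cluster is untested here, so treat it as a starting point rather than a confirmed fix.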

This question seems simple, but I could not solve it and need someone's advice.

UPDATE:

Further investigation with simpler examples. I feel that the truth is near. This code works fine:

sfClusterEval(library("e1071"))

But this call produces an error: 4 nodes produced errors; first error: object 'lib' not found

lib <- "e1071"
sfClusterEval(library(lib, character.only=TRUE))

ANSWER:

The variable lib must first be exported to the cluster nodes. After the call it can be removed again.

lib <- "e1071"
sfExport("lib")
sfClusterEval(library(lib, character.only=TRUE))
sfRemove("lib")
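
Why the export matters can be illustrated without snowfall. In the sketch below a fresh environment stands in for a worker's global environment (real workers are separate R processes, so this is only an analogy), and "stats" replaces "e1071" so that it runs on any installation. The names worker_env and call_obj are made up for the illustration.

```r
# A fresh environment stands in for a worker's global environment
# (an illustration only; real workers are separate R processes).
worker_env <- new.env(parent = baseenv())

lib <- "stats"  # "e1071" in the question; "stats" ships with R
call_obj <- quote(library(lib, character.only = TRUE))

# Without `lib` on the "worker", evaluating the call fails.
before <- tryCatch({ eval(call_obj, worker_env); "loaded" },
                   error = function(e) "error")

# Conceptually what sfExport("lib") does: copy the variable over.
assign("lib", lib, envir = worker_env)
after <- tryCatch({ eval(call_obj, worker_env); "loaded" },
                  error = function(e) "error")

print(c(before = before, after = after))
```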

Thanks to Richie for giving me the starting idea!

DrDom
  • Why don't you use `sfLibrary` to load the packages into workers? – Roman Luštrik Feb 03 '12 at 13:45
  • Excellent advice! I missed this feature of the snowfall package. Shame on me. :) If you post this comment as an answer, I'll be able to mark it as the solution. – DrDom Feb 05 '12 at 06:11

2 Answers

You can use sfLibrary to load extra packages on workers. See ?snowfall and click snowfall-tools.
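
A minimal sketch of how that might look with the package name held in a string, assuming sfLibrary accepts character.only the way library does (check ?sfLibrary for your snowfall version). The package "stats" and cpus = 2 are placeholders chosen so the sketch runs anywhere; the question uses "e1071" and 4 CPUs.

```r
lib <- "stats"  # "e1071" in the question; "stats" ships with R

# Guarded so the sketch is skipped where snowfall is not installed.
if (requireNamespace("snowfall", quietly = TRUE)) {
  library(snowfall)
  sfInit(parallel = TRUE, cpus = 2, type = "SOCK")
  # character.only = TRUE asks sfLibrary to treat `lib` as a string
  # holding the package name, so no sfExport() should be needed.
  sfLibrary(lib, character.only = TRUE)
  sfStop()
}
```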

Roman Luštrik

Whether in a cluster or not, you simply use the character.only argument to library.

library("e1071", character.only = TRUE)

If your nodes report an error stating that they can't find the package, double check that the package is installed on that machine, in a location that is one of .libPaths(). If all else fails, explicitly provide the location of the package in the lib.loc argument to library.
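
Those checks could look roughly like the following on a single machine. This is a sketch with "stats" standing in for "e1071" so that it runs on any installation; pkg_path and installed are names introduced only for the example.

```r
lib <- "stats"  # "e1071" in the question; "stats" ships with R

# Library trees that library() searches by default.
print(.libPaths())

# find.package() returns character(0) when the package is not installed.
pkg_path <- find.package(lib, quiet = TRUE)
installed <- length(pkg_path) > 0

if (installed) {
  # The parent directory of the package folder is its library tree.
  library(lib, character.only = TRUE, lib.loc = dirname(pkg_path[1]))
}
```

On a cluster the same checks would presumably have to run on each node, for example by wrapping them in sfClusterEval(.libPaths()).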

Richie Cotton
  • Hm... It seems that this approach doesn't work in this situation. See the update to the question. Where am I wrong? – DrDom Feb 03 '12 at 11:03
  • @DrDom: Updated with how to find the package. – Richie Cotton Feb 03 '12 at 11:08
  • It was even simpler: the variable `lib` did not exist on the cluster nodes, so it was invisible to them. As I understand it, the expression is evaluated on each node separately. Yes, it's logical, I know. Nevertheless, many thanks for pointing me in the right direction! – DrDom Feb 03 '12 at 11:15
  • Glad you've solved it. If the answer was useful to you, remember to click the up arrow to vote. – Richie Cotton Feb 03 '12 at 13:31
  • You might also consider moving your solution to its own answer (you can answer your own question), then clicking the tick to mark it as solved. – Richie Cotton Feb 03 '12 at 13:32