I'm trying to build a snow cluster with around 120 processes on 3 different hosts. These are AMD servers with 48 cores each. After building approximately the first 90 slaves, I get this error:

> cl = makeSOCKcluster(c(rep("localhost", 44), rep("host2", 46), rep("host3", 45)))
Error in socketConnection(port = port, server = TRUE, blocking = TRUE,  : 
  all connections are in use
> traceback()
3: socketConnection(port = port, server = TRUE, blocking = TRUE, 
       open = "a+b")
2: newSOCKnode(names[[i]], options = options, rank = i)
1: makeSOCKcluster(c(rep("localhost", 44), rep("host2", 46), 
       rep("host3", 45)))

I checked my system limits and don't see any problem:

# cat /proc/sys/fs/file-max
12897622
# grep "#define __FD_SETSIZE" /usr/include/*.h /usr/include/*/*.h
/usr/include/linux/posix_types.h:#define __FD_SETSIZE   1024
# ulimit -a |grep open
open files                      (-n) 65536

Is there a limit on the number of processes that snow can create?

Robert Kubrick

1 Answer

Yes, but only because R limits the total number of connections it can create (currently 128). That limit covers more than just socket connections, which is why you can only get to ~90 worker nodes.

> grep "define NCONNECTIONS" *
connections.c:#define NCONNECTIONS 128 /* snow needs one per slave node */
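
To see how close a session is to that limit, one rough check is to count the rows returned by base R's showConnections(). This is only a sketch: the value 128 is copied from the define quoted above and is an assumption about the R build in use, and the variable names are made up for illustration.

```r
# List every connection allocated in this session.
# all = TRUE also includes stdin, stdout and stderr.
conns <- showConnections(all = TRUE)
n_open <- nrow(conns)

# Assumed limit, taken from the NCONNECTIONS define quoted above;
# other R builds may be compiled with a different value.
max_conns <- 128L

cat("connections in use:", n_open, "\n")
cat("slots remaining:   ", max_conns - n_open, "\n")
```

Running this on the master while the cluster is being built shows one extra row per worker that has connected so far.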

Since you're using GNU/Linux, I would suggest using multicore instead of snow.

Joshua Ulrich
  • Using multicore wouldn't allow him to use 3 hosts. But he could use an MPI cluster with snow. – Steve Weston May 09 '13 at 12:51
  • Another alternative is to use one worker per node with snow, and then use multicore on each worker. That's more work, but scales much better. – Steve Weston May 09 '13 at 12:56
  • @SteveWeston Doesn't R need to open a connection for each MPI worker anyway? Or maybe it only communicates with the MPI master process? – Robert Kubrick May 09 '13 at 13:29
  • @RobertKubrick: snow uses Rmpi to create MPI clusters, and Rmpi uses an MPI distribution such as Open MPI, and so is not subject to the limitation of R connections. MPI has been used with thousands of workers. – Steve Weston May 09 '13 at 13:41
  • @SteveWeston OK, sounds like the way to go. Now when I try to install Rmpi I get 'package ‘Rmpi’ is not available (for R version 2.15.1)' through the USA (OH) CRAN mirror. – Robert Kubrick May 09 '13 at 13:45
  • @RobertKubrick: There aren't any binary distributions of Rmpi available on CRAN, but there are source distributions. However, it's often not trivial to install from source, which is why many people use SOCK clusters in snow. – Steve Weston May 09 '13 at 13:51