I would like to set up a Ray cluster to use Ray Tune over 4 GPUs on AWS, but each GPU belongs to a different member of our team. I have scoured the available resources for an answer and found nothing. Help?

1 Answer

In order to start a Ray cluster using instances that span multiple AWS accounts, you'll need to make sure that the AWS instances can communicate with each other over the relevant ports. To enable that, you will need to modify the AWS security groups for the instances (though be sure not to open up the ports to the whole world).
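
As a rough illustration of keeping the ports closed to the whole world, the sketch below builds ingress rules that admit only specific teammate IPs. The function name ingress_rules, the port numbers, and the example addresses are all hypothetical. The dictionary layout follows the IpPermissions shape that boto3's authorize_security_group_ingress accepts, but nothing here calls AWS, so treat it as a starting point rather than a tested setup.

```python
import ipaddress


def ingress_rules(ports, peer_ips):
    """Build EC2-style TCP ingress rules that admit only the given IPv4 peers.

    Hypothetical helper: it only constructs the rule dictionaries and never
    contacts AWS.
    """
    cidrs = []
    for ip in peer_ips:
        # IPv4Address raises ValueError for anything that is not a single
        # IPv4 address, so a range such as "0.0.0.0/0" is rejected.
        addr = ipaddress.IPv4Address(ip)
        cidrs.append(f"{addr}/32")
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr} for cidr in cidrs],
        }
        for port in sorted(set(ports))
    ]


if __name__ == "__main__":
    # Placeholder ports and documentation-range IPs, not real values.
    for rule in ingress_rules([6379, 12345], ["203.0.113.10", "198.51.100.7"]):
        print(rule)
```

Each account owner would apply rules like these to their own instance's security group, listing the public IPs of the other three instances.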

You can choose which ports are used via the arguments --redis-port, --redis-shard-ports, --object-manager-port, and --node-manager-port to ray start on the head node, and just --object-manager-port and --node-manager-port on the non-head nodes. See the relevant documentation.
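
To make the flag usage above concrete, here is a minimal sketch that assembles the two ray start command lines with fixed ports. The port numbers are arbitrary placeholders and the helper names are hypothetical. It uses only the flags named above plus --head. A non-head node also has to be told where the head is; the flag for that has been --redis-address or --address depending on the Ray version, so it is left out here and should be checked against the documentation for your version.

```python
import shlex

# Placeholder port choices, not Ray defaults. Pick ports that your
# security groups allow between the four instances.
PORTS = {
    "--redis-port": "6379",
    "--redis-shard-ports": "6380,6381",
    "--object-manager-port": "12345",
    "--node-manager-port": "12346",
}

# Non-head nodes only pin these two ports.
WORKER_FLAGS = ("--object-manager-port", "--node-manager-port")


def ray_start_command(head, ports=PORTS):
    """Return a `ray start` command line with the ports pinned."""
    args = ["ray", "start"]
    if head:
        args.append("--head")
        flags = list(ports)
    else:
        # The flag pointing at the head node is intentionally omitted;
        # its name depends on the Ray version.
        flags = WORKER_FLAGS
    args += [f"{flag}={ports[flag]}" for flag in flags]
    return " ".join(shlex.quote(a) for a in args)


def ports_to_open(ports=PORTS):
    """List every TCP port the security groups must allow between nodes."""
    found = set()
    for value in ports.values():
        found.update(int(p) for p in value.split(","))
    return sorted(found)


if __name__ == "__main__":
    print(ray_start_command(head=True))
    print(ray_start_command(head=False))
    print(ports_to_open())
```

Pinning every port this way means the security group rules only need to cover a short, known list of ports.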

However, what you're trying to do sounds somewhat complex. It'd be much easier to use a single account if possible, in which case you could use the Ray autoscaler.

Robert Nishihara