
Many ray commands require a CLUSTER_CONFIG_FILE argument.

For example:

Usage: ray get-head-ip [OPTIONS] CLUSTER_CONFIG_FILE

Options:
  -n, --cluster-name TEXT  Override the configured cluster name.
  --help                   Show this message and exit.

The example config files provided are long and intimidating. For example:

cluster_name: default
min_workers: 0
max_workers: 0
docker:
    image: ""
    container_name: ""
target_utilization_fraction: 0.8
idle_timeout_minutes: 5
provider:
    type: local
    head_ip: YOUR_HEAD_NODE_HOSTNAME
    worker_ips: []
auth:
    ssh_user: YOUR_USERNAME
    ssh_private_key: ~/.ssh/id_rsa
head_node: {}
worker_nodes: {}
file_mounts:
    "/tmp/ray_sha": "/YOUR/LOCAL/RAY/REPO/.git/refs/heads/YOUR_BRANCH"
head_setup_commands: []
worker_setup_commands: []
setup_commands:
    - source activate ray && test -e ray || git clone https://github.com/YOUR_GITHUB/ray.git
    - source activate ray && cd ray && git fetch && git reset --hard `cat /tmp/ray_sha`
#    - source activate ray && cd ray/python && pip install -e .
head_start_ray_commands:
    - source activate ray && ray stop
    - source activate ray && ulimit -c unlimited && ray start --head --redis-port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml
worker_start_ray_commands:
    - source activate ray && ray stop
    - source activate ray && ray start --redis-address=$RAY_HEAD_IP:6379

Say I already have a Ray cluster up and running and just want to do things like submit a job to it from the ray command line. Do I really need all of that, or is there a minimal config I can use?

Duane
  • Here's a minimal example https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/aws/example-minimal.yaml. In the more verbose examples, the defaults should be good so you shouldn't need to change very much. Also, if you *already* have a Ray cluster running and you started it with the autoscaler, you can submit jobs via `ray exec`, see https://ray.readthedocs.io/en/latest/api.html#ray-exec. See https://github.com/ray-project/ray/blob/master/ci/long_running_tests/start_workloads.sh for an example of how to use it. – Robert Nishihara Apr 13 '19 at 23:40
  • Hey Robert, put your answers in the answer section so I can mark them correct and stuff... – Duane Apr 14 '19 at 03:47

1 Answer


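Here's a minimal example: https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/aws/example-minimal.yaml

In the more verbose examples, the defaults should be good, so you shouldn't need to change very much.

Also, if you already have a Ray cluster running and you started it with the autoscaler, you can submit jobs via ray exec; see https://ray.readthedocs.io/en/latest/api.html#ray-exec. For an example of how to use it, see https://github.com/ray-project/ray/blob/master/ci/long_running_tests/start_workloads.sh.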
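
For a sense of scale, a minimal autoscaler config along the lines of the AWS example mentioned above might look roughly like the sketch below. This is written from memory and is not a verbatim copy of that file: the cluster name, region, availability zone and SSH user are placeholder assumptions, and whether the omitted keys are filled in with defaults may depend on your Ray version and provider type, so check it against the example shipped with your release.

```yaml
# Minimal cluster config sketch (illustrative values, not authoritative).

# Name used to identify the head node and workers of this cluster.
cluster_name: minimal

# Upper bound on worker nodes launched in addition to the head node.
max_workers: 1

# Cloud-provider settings; region and zone are placeholders.
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a

# How Ray authenticates with the nodes; the user name is a placeholder.
auth:
    ssh_user: ubuntu
```

The path to a file like this is what gets passed as CLUSTER_CONFIG_FILE to commands such as ray get-head-ip and ray exec.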

Robert Nishihara