
I'd like to cover three scenarios:

  1. I want to copy various configuration files to the head node
  2. I want to copy various data (e.g. images) to the workers, so that each can work on it independently, without having to pass the data through remote calls each time. These could be stored on GCP Cloud Storage.
  3. I want to copy necessary binaries to the workers. These cannot live on shared storage or be passed through remote calls; they have to be physically present on each worker node.

What would be the best way to achieve this?

asked by Muppet
  • Are you starting the cluster on GCP using the Ray autoscaler? This is probably the best way to manage cluster setup and configuration and handling things like syncing files. https://ray.readthedocs.io/en/latest/autoscaling.html#quick-start-gcp – Robert Nishihara Jun 08 '19 at 04:10
  • Yes I am, but this doesn't really solve the issue of copying the files/making them accessible, right? I can run a startup script in `setup_commands`, but all files are private and need to be transferred somehow – Muppet Jun 08 '19 at 05:07
  • I can "ray rsync_up" files, but all three cases above need to be covered, and this doesn't really seem to be the right way to do it. I reckon doing it via the gcp storage is better, but I am not sure how this is supported by ray? – Muppet Jun 08 '19 at 05:17
  • ok, looks like it is possible with `file_mounts: { "/path1/on/remote/machine": "/path1/on/local/machine" }` – Muppet Jun 08 '19 at 19:49
  • Yes, I think `file_mounts` should do the trick. – Robert Nishihara Jun 09 '19 at 04:41
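
Based on the `file_mounts` suggestion in the comments, here is a minimal sketch of the relevant section of an autoscaler cluster YAML. The paths are illustrative placeholders, not taken from an actual setup; `file_mounts` rsyncs each local path to every node (head and workers) when the cluster is started or updated with `ray up`.

```yaml
# Relevant fragment of a Ray autoscaler cluster config (cluster.yaml).
# All paths below are placeholders.
file_mounts:
    # 1. configuration files for the head node (file_mounts syncs to all
    #    nodes, so these also land on the workers, which is usually harmless)
    "/home/ubuntu/config": "./config"
    # 3. binaries that must be physically present on every worker node
    "/home/ubuntu/bin": "./bin"
```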
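For the image data on GCP Cloud Storage (scenario 2), one option is to pull it onto each node in `setup_commands`, which run on every node after `file_mounts` have been synced. This sketch assumes `gsutil` is available on the node image and that the nodes' service account can read the bucket; the bucket name and paths are placeholders.

```yaml
setup_commands:
    # make the synced binaries executable
    - chmod +x /home/ubuntu/bin/*
    # 2. copy the data set from GCP Cloud Storage onto each node so that
    #    tasks can read it locally instead of receiving it via remote calls
    - mkdir -p /home/ubuntu/images && gsutil -m rsync -r gs://my-bucket/images /home/ubuntu/images
```

Running `ray up cluster.yaml` applies both sections to new and existing nodes; `ray rsync_up` remains available for one-off ad-hoc syncs to the head node.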

0 Answers