
I am trying to launch a ray cluster and I am using Poetry as my package manager.

The problem is that when I run poetry run ray up config.yaml, I get:

New status: setting-up
  [4/7] Running initialization commands
Shared connection to XX.XX.XX.XX closed.
  [5/7] Initalizing command runner
  [6/7] No setup commands to run.
  [7/7] Starting the Ray runtime
bash: ray: command not found
Shared connection to XX.XX.XX.XX closed.  
New status: update-failed
  !!!
  SSH command failed.
  !!!
  
  Failed to setup head node.

However, if I install the exact same version of ray with pip as a user package and run ray up config.yaml, setup works. The problem then is that remote actors don't use the package versions from my virtual environment, even when the virtual environment is activated when I run the script that calls ray.init().

I have tried activating the virtual environment via initialization_commands and setup_commands in config.yaml, to no avail.

Does anyone know how to set up a ray cluster using a specified virtual environment (ray is installed in the virtual environment, but not as a user or global package)? Any help would be appreciated!

The cluster is a SLURM cluster, if that makes any difference. Every machine shares the same file system, so the virtual environment is available on every machine.

HashBr0wn

1 Answer


I worked out the issue with a little hack, it seems. Rather than specifying any setup_commands, sourcing the virtual environment in the .bashrc file did the trick. (This is slightly annoying: if I am in a shell on one of those machines and don't need the virtual environment, I now have to deactivate it every time.)
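For anyone wanting to try the same thing, here is a rough sketch of what the .bashrc addition could look like. The function name activate_ray_venv and the RAY_VENV variable are made up for illustration, and the path has to be replaced with wherever your Poetry virtual environment actually lives on the shared file system. The guard only avoids errors if the path is missing; it does not change the behaviour described above.

```shell
# Sketch of a ~/.bashrc fragment. activate_ray_venv and RAY_VENV are
# hypothetical names chosen for this example, not anything Ray defines.

# Source the venv's activate script if it exists; return non-zero otherwise
# so a wrong or missing path does not produce errors at login.
activate_ray_venv() {
    venv_dir="$1"
    if [ -f "$venv_dir/bin/activate" ]; then
        . "$venv_dir/bin/activate"
    else
        return 1
    fi
}

# Set RAY_VENV to the directory of the virtual environment that contains ray.
# If it is unset or wrong, nothing is activated and the shell starts as usual.
activate_ray_venv "${RAY_VENV:-}" || true
```

If your .bashrc returns early for non-interactive shells, these lines may need to go above that check to take effect for the commands ray up runs over SSH. In a shell where the environment is not wanted, running deactivate still works as before.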

HashBr0wn