I want to:

  1. Have multiple python environments in my pyspark dataproc cluster
  2. Specify while submitting the job which environment I want to execute my submitted job in
  3. I want to persist the environments so that I can use them on an as-needed basis. I won't tear down the cluster but I would occasionally stop it. I want the environments to persist the way they do on a normal VM

Currently, I know how to submit a job together with its entire environment using conda-pack, but that approach ships the whole environment payload on every submission and still doesn't address managing multiple environments for different projects.
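For context, my current conda-pack submission looks roughly like the sketch below (the environment, bucket, cluster, and region names are placeholders, and I'm assuming the usual `--archives` alias plus `spark.pyspark.python` pattern):

    # Pack the conda environment and stage it in GCS (all names are placeholders)
    conda pack -n my_env -o my_env.tar.gz
    gsutil cp my_env.tar.gz gs://my-bucket/envs/my_env.tar.gz

    # Ship the packed environment with every job and point PySpark at its interpreter
    gcloud dataproc jobs submit pyspark job.py \
        --cluster=my-cluster \
        --region=us-central1 \
        --archives=gs://my-bucket/envs/my_env.tar.gz#env \
        --properties="spark.pyspark.python=./env/bin/python,spark.pyspark.driver.python=./env/bin/python"

This works, but the tarball travels with each submission instead of the environments living on the cluster itself.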
