
There are 16 vCPUs on my Vertex AI Jupyter notebook, and I am writing a parallelized script. I wasn't sure whether the right approach is to hardcode the parallelism based on the number of vCPUs (and if so, how to choose n_workers vs. threads_per_worker). Should I do this

from dask.distributed import Client

num_vcpus = 16
# one single-threaded worker process per vCPU
dask_client = Client(threads_per_worker=1, n_workers=num_vcpus)

or should I use the cloud deployment pipeline outlined here instead? How much parallelization do I need to set up by hand (creating futures, tasks, and partitions, then calling gather)? For context, I don't know what Kubernetes is.
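For reference, this is roughly what I have in mind: a sketch that detects the vCPU count at runtime instead of hardcoding 16, and uses the map/gather pattern. The choice of one single-threaded worker per vCPU is my assumption for CPU-bound pure-Python work; I gather that GIL-releasing workloads (NumPy/pandas) may prefer fewer workers with several threads each.

```python
import os
from dask.distributed import Client

# Detect the vCPU count at runtime rather than hardcoding 16, so the
# same script works if the notebook instance is resized.
num_vcpus = os.cpu_count()

# Assumption: for CPU-bound pure-Python tasks (which hold the GIL),
# one single-threaded worker process per vCPU is a reasonable default.
# For NumPy/pandas-heavy work that releases the GIL, fewer workers
# with several threads each (e.g. n_workers=4, threads_per_worker=4)
# reduces inter-process data transfer.
client = Client(n_workers=num_vcpus, threads_per_worker=1)

# map returns futures immediately; gather blocks until all results
# are back from the workers.
futures = client.map(lambda x: x ** 2, range(8))
results = client.gather(futures)
client.close()
```

Is this amount of manual setup (map, then gather) typical, or does Dask's higher-level API usually make it unnecessary?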

Tanishq Kumar