
When working on a local project, `from local_project.funcs import local_func` fails on the cluster because `local_project` is not installed there.

This forces me to develop everything in the same file.

Solutions? Is there a way to "import" the contents of the module into the working file so that the cluster doesn't need to import it?

Installing `local_project` on the cluster is not development-friendly, because any change to an imported function requires redeploying the cluster.

import dask
from dask.distributed import Client
from dask_kubernetes import KubeCluster, make_pod_spec
from local_project.funcs import local_func

pod_spec = make_pod_spec(
    image="daskdev/dask:latest",
    memory_limit="4G",
    memory_request="4G",
    cpu_limit=1,
    cpu_request=1,
)
cluster = KubeCluster(pod_spec)
client = Client(cluster)  # connect so computations run on the cluster workers

df = dask.datasets.timeseries()
df.groupby('id').apply(local_func).compute()  # fails if local_project is not installed on the workers

Nuno Silva

1 Answer


Typically the solution to this is to build your own Docker image. If you have only a single file, or an egg or zip file, then you might also look into the `Client.upload_file` method.
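
A minimal sketch of the `upload_file` route, assuming the code the workers need fits in a single file called local_funcs.py (the filename here is illustrative):

from dask.distributed import Client
from dask_kubernetes import KubeCluster, make_pod_spec

pod_spec = make_pod_spec(image="daskdev/dask:latest")
cluster = KubeCluster(pod_spec)
client = Client(cluster)

# Ship the file to all current and future workers; after this,
# `from local_funcs import local_func` also resolves inside tasks.
client.upload_file("local_funcs.py")

For anything larger than a single module, a custom Docker image (or uploading an egg/zip, as in the comment below) is the more robust option.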

MRocklin
    to automate in code I did: `_ = subprocess.check_call(["python", "setup.py", "bdist_egg"])` and then `client.upload_file(glob.glob("./dist/*.egg")[0])` – Nuno Silva Jul 20 '20 at 14:43
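
Written out as a runnable sketch, the automation from this comment looks roughly like the following; it assumes the project root contains a setup.py and that `client` is a `dask.distributed.Client` connected to the KubeCluster from the question:

import glob
import subprocess

# Build an egg of the local project in ./dist/
subprocess.check_call(["python", "setup.py", "bdist_egg"])

# Upload the freshly built egg to every worker so that
# `from local_project.funcs import local_func` resolves on the cluster.
client.upload_file(glob.glob("./dist/*.egg")[0])

Re-running these two lines after each local change pushes the new build to the workers without redeploying the cluster.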