I use a Dask cluster to process a dataset. My Dask Python project is structured as follows:
project_name/
    folder_a/
        a.py
    folder_b/
        b.py
In b.py, I import function_a() from a.py as follows:
import dask
from dask import delayed

from folder_a.a import function_a

# Wrap the call so that it runs lazily on a worker.
@delayed
def test(input_str):
    return function_a(input_str)

client = ...
d_task = [test(str)]
dask.compute(d_task)
However, the Dask worker nodes raise an exception: ModuleNotFoundError: No module named 'folder_a'
I have tried several solutions, such as setting PYTHONPATH={dir}/project_name, but none of them worked.
How can I solve this problem? Is there a way to add {dir}/project_name to the Dask worker environment?
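
For reference, the error message suggests that the worker processes cannot find folder_a on their import path. Below is a minimal sketch of that mechanism using only the standard library, with no Dask involved. The body of function_a and the helper names add_project_to_path and try_import are hypothetical, and the temporary directory stands in for {dir}/project_name.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway copy of the layout from the question.
# The body of function_a is a hypothetical stand-in.
project_root = os.path.join(tempfile.mkdtemp(), "project_name")
os.makedirs(os.path.join(project_root, "folder_a"))
with open(os.path.join(project_root, "folder_a", "a.py"), "w") as f:
    f.write("def function_a(input_str):\n    return input_str.upper()\n")


def add_project_to_path(root):
    """Put the project root on sys.path so that 'folder_a' becomes importable."""
    if root not in sys.path:
        sys.path.insert(0, root)
    importlib.invalidate_caches()


def try_import():
    """Return function_a, or None if 'folder_a' cannot be found."""
    try:
        from folder_a.a import function_a
        return function_a
    except ModuleNotFoundError:
        return None


before = try_import()            # project root is not on sys.path yet
add_project_to_path(project_root)
after = try_import()             # the same import now resolves
```

On a cluster, the same sys.path change would presumably have to happen inside each worker process, for example by passing a function like add_project_to_path to Client.run, assuming the project directory exists at that path on every worker.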