
I am getting:

KilledWorker: ("('from_pandas-1445321946b8a22fc0ada720fb002544', 4)", 'tcp://127.0.0.1:45940')

I've read the explanation of that error message, but it is all the more confusing in combination with the error message at the top of the stack trace:

distributed.utils - ERROR - Worker already exists tcp://127.0.0.1:35780

The actual error, piped to the terminal running the jupyter notebook command for my notebook:

ModuleNotFoundError: No module named '_cython_magic_faba6120a194ab58ae9efd1da474433f'

So I will look into how to solve this myself, now that I've found the detailed error in my case. A pinpointed tip about this peculiar configuration would be nice, but I guess it is more sensible to extract all the Cython code into Python library code outside the notebook, rather than hammer Dask into knowing about Cython magic commands?

matanster

2 Answers


Here's a complete toy example (tested on JupyterLab using a SLURM cluster). The example uses Cython to compile a trivial function that sums two integers, but of course the same technique applies to more complex (and more useful) code.
The key trick here is that one has to set up the workers to find and import the Cython-built module.
This requires importing pyximport, calling pyximport.install(), and then importing the Cython-generated module on each worker. This is done using register_worker_callbacks(). Note that the Cython-generated module is placed in the `<IPYTHONCACHEDIR>/cython` directory (IPYTHONCACHEDIR can be found by calling IPython.paths.get_ipython_cache_dir()). That directory must be added to the paths where Python looks for modules, so that the Cython-generated module can be loaded.
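As a side note, the cache-dir lookup can be done programmatically rather than pasting the path in by hand. A minimal sketch (it assumes IPython is importable, which holds on the client; on a bare worker the manual `<IPYTHONCACHEDIR>` placeholder is still the fallback):

```python
import os
import sys

try:
    # On the client, IPython can report its cache dir directly.
    from IPython.paths import get_ipython_cache_dir
    cython_cache = os.path.join(get_ipython_cache_dir(), "cython")
except ImportError:
    # On a worker without IPython, fill in <IPYTHONCACHEDIR> by hand.
    cython_cache = "<IPYTHONCACHEDIR>/cython"

if cython_cache not in sys.path:
    sys.path.insert(0, cython_cache)

print(os.path.basename(cython_cache))  # cython
```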
This example assumes SLURM, but it is just for my convenience. The dask.distributed "network" could be set up with any other method (see for instance http://distributed.dask.org/en/latest/setup.html).

from dask import delayed

%load_ext cython

# Create a toy Cython function and put it into a module named remoteCython.
# (%%cython must be the first line of its notebook cell.)
%%cython -n remoteCython
def cython_sum(int a, int b):
    return a + b

# Set up a distributed cluster (minimal, just for illustration)
# I use SLURM.
from dask_jobqueue import SLURMCluster
from distributed import Client

cluster = SLURMCluster(memory="1GB",
                       processes=1,
                       cores=1,
                       walltime="00:10:00")

cluster.start_workers(1)   # Start as many workers as needed.
                           # (Newer dask_jobqueue versions use cluster.scale(1) instead.)

client = Client(cluster)

def init_pyx(dask_worker):
    import pyximport
    pyximport.install()

    import sys
    sys.path.insert(0,'<IPYTHONCACHEDIR>/cython/')   # <<< replace <IPYTHONCACHEDIR> as appropriate

    import remoteCython

client.register_worker_callbacks(init_pyx)  # This runs init_pyx() on any Worker at init

import remoteCython

# ASIDE: you can find the full path of the Cython-generated library by
# looking at remoteCython.__file__

# The following creates a task and submits to the scheduler.
# The task computes the sum of 123 and 321 via the Cython function defined above
future = client.compute(delayed(remoteCython.cython_sum)(123,321)) 

# The task is executed on the remote worker

# We fetch the result from the remote worker
print(future.result())   # This prints 444

# We're done. Let's release the SLURM jobs.
cluster.close()
P.Toccaceli
  • Thanks a lot. Obviously I abandoned this at the time, but I think this answer should be a valuable resource now. I wonder how SLURM fits in; I have only lately become aware of it, and it would be extra nice if you could add a few more words about the integration, either in the answer or just in a short comment. E.g., would one need much more than the above to run this over a SLURM cluster spanning many machines, i.e. as a distributed rather than a local-concurrent job? – matanster May 13 '19 at 16:52

The specific Cython error does indeed look like it comes from the compiled module not being visible to the workers. When you use %%cython, a temporary extension is created, built, and imported into the local (client) session without being installed into the Python environment. Exactly how that happens I am not sure.

You should at the very least ensure that you create your client after you compile your Cython cell; the workers may then inherit the required environment. But there's a decent chance that the monkey-patching done by the cell magic is too complex for that to work in any case.
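To make that failure mode concrete, here is a pure-Python sketch (no Cython involved; `fake_cython_mod` is a hypothetical stand-in for the generated extension) of how a module that imports fine in one process raises ModuleNotFoundError in another until its directory is added to sys.path, which is exactly what the worker callback in the other answer arranges:

```python
import importlib
import os
import sys
import tempfile

# fake_cython_mod stands in for the %%cython-generated extension: a
# module sitting in a directory that a fresh worker process does not
# have on sys.path.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "fake_cython_mod.py"), "w") as f:
    f.write("def cython_sum(a, b):\n    return a + b\n")
importlib.invalidate_caches()

try:
    import fake_cython_mod          # fails: tmpdir is not on sys.path
except ModuleNotFoundError:
    sys.path.insert(0, tmpdir)      # what the worker callback does
    import fake_cython_mod

print(fake_cython_mod.cython_sum(123, 321))  # 444
```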

mdurant
  • Thanks a lot. So generally speaking, if we take Jupyter cell magic out of the picture, and all our cython code is defined in python library code (and only, possibly, run from there in a Jupyter notebook), are we safe using our cython functions in dask `apply`? – matanster Aug 24 '18 at 17:46
  • cython functions should serialise or be importable like any other python function, and so successfully make it into the workers' memory space. – mdurant Aug 24 '18 at 20:32
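As an illustration of that last point: a plain Python function defined at module level is pickled by reference (module name plus qualified name), so the receiving worker only needs to be able to import the same module. (Dask actually uses cloudpickle, which can additionally serialise interactively defined functions by value, but the import-by-reference path is the common case.) A minimal stdlib sketch:

```python
import pickle
import math

# A module-level function pickles as a reference to "math.sqrt",
# not as code; unpickling re-imports it on the other side.
payload = pickle.dumps(math.sqrt)
restored = pickle.loads(payload)

print(restored is math.sqrt)  # True
print(restored(16.0))         # 4.0
```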