I am trying to set up Prefect 2 with Dask to run some tasks in a Django project. A simple example with little code runs fine, but a larger example fails with:
distributed.worker - ERROR - Could not deserialize task ****-****-****-****
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/distributed/worker.py", line 2185, in execute
function, args, kwargs = await self._maybe_deserialize_task(ts)
File "/usr/local/lib/python3.8/site-packages/distributed/worker.py", line 2158, in _maybe_deserialize_task
function, args, kwargs = _deserialize(*ts.run_spec)
File "/usr/local/lib/python3.8/site-packages/distributed/worker.py", line 2835, in _deserialize
kwargs = pickle.loads(kwargs)
File "/usr/local/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 73, in loads
return pickle.loads(x)
File "/tmp/tmpb11md_6wprefect/my_project/models/__init__.py", line 2, in <module>
from my_project.models.my_model import *
File "/tmp/tmpb11md_6wprefect/my_project/models/api_key.py", line 14, in <module>
class ApiKey(models.Model):
File "/usr/local/lib/python3.8/site-packages/django/db/models/base.py", line 108, in __new__
app_config = apps.get_containing_app_config(module)
File "/usr/local/lib/python3.8/site-packages/django/apps/registry.py", line 253, in get_containing_app_config
self.check_apps_ready()
File "/usr/local/lib/python3.8/site-packages/django/apps/registry.py", line 136, in check_apps_ready
raise AppRegistryNotReady("Apps aren't loaded yet.")
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
The problem here is that I use an __init__.py in a models directory (it imports every model module) instead of splitting the models across multiple apps. To circumvent this error in other places, I set up the Django environment at the beginning of each task, e.g.:
import datetime
import os  # needed for os.environ below
import xyz
import django
os.environ.setdefault(
"DJANGO_SETTINGS_MODULE",
os.environ.get("DJANGO_SETTINGS_MODULE", "core.settings"),
)
django.setup()
# Now we import the models
from my_project.models import MyModel
This has worked fine so far, but it does not help with deserialization in Dask. Judging by the traceback, the worker imports the models package while unpickling the task's kwargs, before any code inside the serialized task (including django.setup()) has run.
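To illustrate what I think is going on, here is a minimal stdlib-only sketch. It assumes the cause is pickle's normal behaviour of storing classes by module path and importing that module during loads. The names fake_models, FakeModel and FAKE_APPS_READY are made-up stand-ins for my models package and for django.setup(); they are not part of Django, Dask or Prefect.

```python
import importlib
import os
import pickle
import sys
import tempfile
import textwrap

# Hypothetical stand-in for my_project/models: importing it fails unless
# "setup" has happened, mimicking Django's AppRegistryNotReady check.
MODULE_SOURCE = textwrap.dedent(
    """
    import os

    if os.environ.get("FAKE_APPS_READY") != "1":
        raise RuntimeError("Apps aren't loaded yet.")

    class FakeModel:
        def __init__(self, pk):
            self.pk = pk
    """
)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "fake_models.py"), "w") as fh:
            fh.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            # "Client" side: setup has been done, so the import works.
            os.environ["FAKE_APPS_READY"] = "1"
            fake_models = importlib.import_module("fake_models")
            payload = pickle.dumps({"obj": fake_models.FakeModel(1)})

            # "Worker" side: module not imported yet and no setup done.
            del sys.modules["fake_models"]
            os.environ.pop("FAKE_APPS_READY")
            try:
                # Unpickling imports fake_models, which raises immediately.
                pickle.loads(payload)
                error = None
            except RuntimeError as exc:
                error = str(exc)

            # Once "setup" has run, the very same payload loads fine.
            os.environ["FAKE_APPS_READY"] = "1"
            restored = pickle.loads(payload)
            return error, restored["obj"].pk
        finally:
            # Clean up global state touched by the demo.
            sys.path.remove(tmp)
            sys.modules.pop("fake_models", None)
            os.environ.pop("FAKE_APPS_READY", None)


if __name__ == "__main__":
    print(demo())
```

If this is really what happens on the worker, then the setup has to run in the worker process before the first task is deserialized, not inside the task itself.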
Is there any way I can fix this?