I'm working on a container-based Apache Airflow application. My environment consists of the following components:
- Airflow Scheduler container
- Airflow Webserver container
- Airflow Celery Flower container
- Airflow Worker container (1)
- etc.
My understanding of this pattern is that I can have scheduler and webserver containers with just the necessary dependencies for Airflow, and then one or more worker nodes with everything I need to run my DAG.
When I try to work with it this way (for instance, adding and using a module in the worker node, let's say it's the crypto module), I get a DAG Import Error exception in the front end that says the following:
ModuleNotFoundError: No module named 'crypto'
This makes sense to me, because the scheduler knows that I'll need that module for the execution and throws an error. Despite this, the DAG works correctly, because when it runs on the worker node, all the required dependencies are available.
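The behaviour described above can be reproduced without Airflow. What follows is only a minimal sketch using the standard library, assuming that the scheduler container loads the DAG file as an ordinary Python module. The names `module_only_on_worker` (a stand-in for crypto), `example_dag`, and `load_like_a_parser` are hypothetical and exist only for illustration.

```python
import importlib.util
import tempfile
import textwrap
from pathlib import Path

# Hypothetical DAG-like file. The import sits at module level, so it runs
# whenever any process loads the file, not only when the task executes.
DAG_SOURCE = textwrap.dedent(
    """
    import module_only_on_worker  # hypothetical stand-in for 'crypto'

    def run_task():
        return module_only_on_worker.__name__
    """
)


def load_like_a_parser(source: str):
    """Load the file in a process that lacks the dependency.

    Returns the error message if the load fails, or None if it succeeds.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example_dag.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location("example_dag", path)
        module = importlib.util.module_from_spec(spec)
        try:
            # Executing the module body triggers the top-level import.
            spec.loader.exec_module(module)
        except ModuleNotFoundError as exc:
            return str(exc)
    return None


error = load_like_a_parser(DAG_SOURCE)
print(error)
```

In this sketch the load fails before `run_task` is ever called, which would match an import error being reported by a container that only loads the file while the task itself still succeeds on a worker that has the module installed.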
How can I fix this?
Thanks