
I am running Apache Airflow on Docker. I want to install an Airflow provider package for Spark; my docker-compose.yml file looks like this. I want Spark to appear as a connection type when I create a new connection in Airflow. How can I do this?

sanchit08
2 Answers


You should build a new, custom image and use it instead of the base image. See: https://airflow.apache.org/docs/docker-stack/build.html
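
For example, a minimal Dockerfile sketch of that approach (the 2.3.0 tag is an assumption; pin the Airflow and provider versions you actually run):

FROM apache/airflow:2.3.0
# Bake the Spark provider into the image so it is available in every container
RUN pip install --no-cache-dir apache-airflow-providers-apache-spark

Build it (e.g. docker build -t my-airflow .) and point the image key under x-airflow-common in your docker-compose file at the new tag instead of the base image.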

Jarek Potiuk

You can now specify providers to install at startup using the default compose file, without having to build a custom image. Append the providers' pip package names to the _PIP_ADDITIONAL_REQUIREMENTS environment variable in the docker-compose file:

...
x-airflow-common:
  ...
  environment:
    &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: ...
    ...
    # for the Spark connection type in this question, include apache-airflow-providers-apache-spark
    _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-apache-airflow-providers-docker apache-airflow-providers-apache-spark}
volumes:
    ...
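
A minimal way to set the variable (a sketch; the .env approach assumes a .env file sitting next to docker-compose.yaml, which compose reads automatically):

# .env, next to docker-compose.yaml
_PIP_ADDITIONAL_REQUIREMENTS=apache-airflow-providers-apache-spark

or inline for a one-off run:

_PIP_ADDITIONAL_REQUIREMENTS=apache-airflow-providers-apache-spark docker compose up

Note that these requirements are reinstalled on every container start, which the official quick-start recommends only for quick trials; for anything longer-lived, build a custom image as described in the other answer.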

https://stackoverflow.com/a/68607370

Antoine Dahan