I set up my DAG in a Docker container as follows:

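from airflow import DAG
from airflow.decorators import task
from airflow.operators.bash import BashOperator

# dag_config is the default_args dict, defined elsewhere
with DAG(
    "my_dag",
    default_args=dag_config,
    schedule="@weekly",
) as dag:

    # Install the local package in editable mode before the task runs
    config_env = BashOperator(
        task_id="config_env",
        bash_command="cd /opt/airflow/include/my_package && python -m pip install -e .",
    )

    @task()
    def get_my_function():
        from my_package import my_function

        my_function()

    # Run the install first, then the task that imports the package
    config_env >> get_my_function()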

When I run this DAG, the task fails with:

ModuleNotFoundError: No module named 'my_package'

However, if I open a Python shell inside the Docker container and do the import there, it works without any problem:

from my_package import my_function

which suggests that my_package is successfully installed in that environment.

I also checked whether the Python shell and Airflow use the same Python environment. For the Python shell, I ran which python.

For Airflow DAG, I modified the task as follows:

@task()
def get_my_function():
    import sys

    # Show which interpreter and search path the task actually uses
    print("Python Executable Path:", sys.executable)
    print("Python Version:", sys.version)
    print("sys.path:", sys.path)

    from my_package import my_function

The Python executable is the same in both cases: /usr/local/bin/python. I also tried PythonOperator instead of the TaskFlow API, and the same error occurs.
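Since the executable paths match, the difference is more likely in sys.path than in the interpreter. A small diagnostic such as the one below could be run both in the interactive shell and inside the task to compare the two. This is only a sketch: the helper name describe_import is hypothetical and not part of Airflow, and it relies on nothing beyond the standard library.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report whether a top-level module can be found, and from where.

    Hypothetical helper for illustration; not part of Airflow.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any
        "origin": getattr(spec, "origin", None),
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    # "my_package" is the package name from the question
    for key, value in describe_import("my_package").items():
        print(f"{key}: {value}")
```

If "found" is True in the shell but False inside the task, comparing the two sys_path lists should show which directory the task process is missing.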

  • Airflow 2.5.1
  • Python 3.9
panday1995

1 Answer

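The problem is that Airflow cannot find the package source through the ./lib/site-packages path: an editable install (pip install -e) does not copy the package there, it only links to the source directory. I solved it by adding the package's source directory to sys.path directly.

In the Airflow DAG file, write the following: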

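import sys

from airflow import DAG
from airflow.decorators import task
from airflow.operators.bash import BashOperator

# Make the package source importable from within Airflow tasks
sys.path.append("/opt/airflow/include/my_package/src")

# dag_config is the default_args dict, defined elsewhere
with DAG(
    "my_dag",
    default_args=dag_config,
    schedule="@weekly",
) as dag:

    config_env = BashOperator(
        task_id="config_env",
        bash_command="cd /opt/airflow/include/my_package && python -m pip install -e .",
    )

    @task()
    def get_my_function():
        from my_package import my_function

        my_function()

    # Run the install first, then the task that imports the package
    config_env >> get_my_function()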
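Because Airflow parses DAG files repeatedly, a plain sys.path.append can add the same entry more than once within a process. A slightly more defensive variant of the same idea is sketched below. The helper name ensure_on_sys_path is hypothetical, and the path is the one used in this answer, so it should be adjusted to the actual source layout.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory):
    """Append directory to sys.path once; return True if it was added.

    Hypothetical helper for illustration; not part of Airflow.
    """
    resolved = str(Path(directory).resolve())
    if resolved in sys.path:
        return False
    sys.path.append(resolved)
    return True


# Path taken from this answer; adjust to where the package source lives
ensure_on_sys_path("/opt/airflow/include/my_package/src")
```

Appending rather than inserting at the front keeps installed packages ahead of the source directory on the search path, which matches the behaviour of the original sys.path.append call.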

panday1995