
I need to deploy my DAG under a folder named after the repository, and the DAG needs to import modules that are deployed from the same repository to a different path in the bucket.

I have a cloudbuild.yaml that deploys the scripts into the DAG folder and the Plugins folder, but I don't know how to import modules from that other path in the Cloud Composer storage bucket.

This is my bucket layout:

cloud-composer-bucket/
    dags/
        github_my_repository_deployed-testing/
            test_dag.py
    plugins/
        github_my_repository_deployed-testing/
            planning/
                modules_1.py

I need to import modules_1.py from my test_dag.py. I used this statement to import the module:

from planning.modules_1 import get_data

But with this approach I get the following error:

Broken DAG: [/home/airflow/gcs/dags/github_my_repository_deployed-testing/test_dag.py] Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/airflow/gcs/dags/github_my_repository_deployed-testing/test_dag.py", line 7, in <module>
    from planning.modules_1 import get_date
ModuleNotFoundError: No module named 'planning'

This is my cloudbuild.yaml:

steps:
- id: 'Push into Composer DAG'
  name: 'google/cloud-sdk'
  entrypoint: 'sh'
  args: [ '-c', 'gsutil -m rsync -d -r ./dags ${_COMPOSER_BUCKET}/dags/$REPO_NAME']
- id: 'Push into Composer Plugins'
  name: 'google/cloud-sdk'
  entrypoint: 'sh'
  args: [ '-c', 'gsutil -m rsync -d -r ./plugins ${_COMPOSER_BUCKET}/plugins/$REPO_NAME']
- id: 'Code Scanning'
  name: 'python:3.7-slim'
  entrypoint: 'sh'
  args: [ '-c', 'pip install bandit && bandit --exit-zero -r ./']
substitutions:
    _CONTAINER_VERSION: v0.0.1
    _COMPOSER_BUCKET: gs://asia-southeast1-testing-cloud-composer-025c0511-bucket

My question is: what is the best way to import the other modules into the DAG?

MADFROST

1 Answer


You can put all of your modules in the Cloud Composer DAG folder, for example:

cloud-composer-bucket/
    dags/
        github_my_repository_deployed-testing/
            test_dag.py
        planning/
            modules_1.py
        
        setup.py

In the DAG Python code, you can then import your module as follows:

from planning.modules_1 import get_data
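
To see why this layout resolves, here is a minimal local sketch that imitates it in a temporary directory. It relies on the assumption that Airflow places the DAGs folder on sys.path, so `planning` has to sit directly under that folder. The temporary directory, the empty `__init__.py`, and the body of `get_data` are placeholders and not part of the real environment.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway stand-in for the bucket's dags/ folder (hypothetical location).
dags_root = Path(tempfile.mkdtemp())

# Recreate dags/planning/modules_1.py with a placeholder get_data().
package_dir = dags_root / "planning"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")  # assumption: not shown in the layout above
(package_dir / "modules_1.py").write_text(
    "def get_data():\n    return 'data from planning.modules_1'\n"
)

# Imitate Airflow putting the DAGs folder on sys.path.
sys.path.insert(0, str(dags_root))
importlib.invalidate_caches()

# Same import statement as in the DAG file.
from planning.modules_1 import get_data

result = get_data()
print(result)
```

If `planning` were nested one level deeper, for instance under the repository-named folder, the same import would fail with ModuleNotFoundError, because its parent directory would no longer be a sys.path entry.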

As far as I remember, Cloud Composer creates the setup.py in the DAG root folder. If that is not the case, you can copy a setup.py into the DAG folder:

bucket/dags/setup.py

Example of a setup.py file:

from setuptools import find_packages, setup

setup(
    name="composer_env_python_lib",
    version="0.0.1",
    install_requires=[],
    data_files=[],
    packages=find_packages(),
)

Other possible solution

You can also use internal Python packages from GCP Artifact Registry if you want (for example, with your planning package).

You can then install your internal Python packages in Cloud Composer via its PyPI packages configuration. Here is a link about this:

private repo Composer Artifact registry
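
If the module has to stay under plugins/<repository folder>/ as the cloudbuild.yaml in the question deploys it, one possible workaround is to add that nested folder to sys.path at the top of the DAG file. This is only a sketch: the mount point /home/airflow/gcs/plugins is assumed by analogy with the /home/airflow/gcs/dags path in the traceback, and the helper name `add_repo_plugins_to_path` is made up for illustration.

```python
import sys
from pathlib import Path

# Assumption: the bucket's plugins/ folder is mounted here on the workers.
PLUGINS_ROOT = Path("/home/airflow/gcs/plugins")
# Repository-named folder created by the Cloud Build step in the question.
REPO_FOLDER = "github_my_repository_deployed-testing"


def add_repo_plugins_to_path(plugins_root=PLUGINS_ROOT, repo_folder=REPO_FOLDER):
    """Put <plugins_root>/<repo_folder> on sys.path (hypothetical helper)."""
    repo_path = str(Path(plugins_root) / repo_folder)
    if repo_path not in sys.path:
        # Insert once, so repeated DAG parsing does not grow sys.path.
        sys.path.insert(0, repo_path)
    return repo_path


add_repo_plugins_to_path()

# With the folder on sys.path, the import from the question has a chance to resolve:
# from planning.modules_1 import get_data
```

The folder name contains a hyphen, so it cannot itself be imported as a package. That is why the sketch adds it to the search path instead of importing through it.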

Mazlum Tosun
  • Thanks for the response. What goes inside `setup.py`? – MADFROST Nov 16 '22 at 10:32
  • You're welcome. I edited my answer to give you more information about the `setup.py` file. I also proposed a second solution, if you are interested. – Mazlum Tosun Nov 16 '22 at 10:47
  • Hey, thank you! I'm curious: why can't I import the Python script from the `Plugins` folder? According to [Composer Bucket](https://cloud.google.com/composer/docs/concepts/cloud-storage#folders_in_the_bucket), that folder `Stores your custom plugins, such as custom in-house Airflow operators, hooks, sensors, or interfaces.` – MADFROST Nov 17 '22 at 01:41
  • You're welcome. If you use a custom `Airflow` operator, you are not obliged to use the `Plugins` folder. You can also put your custom operators in your folder copied to the Airflow `dags` `bucket` folder, for example: `{composer_bucket}/dags/your_dag_folder/operators` – Mazlum Tosun Nov 17 '22 at 14:42
  • Yes, I've tried that, but I got an error. What statement should I use to import from `[composer_bucket]/plugins/planning/modules_1.py`? I tried `from plugins.planning import modules_1`, but it raised an error – MADFROST Nov 18 '22 at 06:06