
I'm setting up an AWS MWAA environment and I have a problem importing custom plugins.

My local project structure looks like this:

airflow-project
├── dags 
│   └── dag1.py  
└── plugins
    ├── __init__.py
    └── operators
        ├── __init__.py
        └── customopertaor.py

I tried to match this structure in the s3 bucket:

s3://{my-bucket-name}
└── DAGS
    ├── dags 
    │   └── dag1.py  
    └── plugins
        ├── __init__.py
        └── operators
            ├── __init__.py
            └── customopertaor.py

However, when I use the custom operator in the local project, the import works like this -

from operators import customOperators

while on MWAA it only recognizes imports like this -

from plugins.operators import customOperators

Is there a way to get MWAA to recognize the import the way my local project does (from operators)? Should I upload the files to S3 in a certain way?

I also tried to upload a plugins.zip file but it didn't work:

s3://{my-bucket-name}
├── DAGS
│   └── dags 
│       └── dag1.py  
└── plugins.zip
Tal Meridor

5 Answers


I had the same problem and I solved it by looking inside my .zip file. In my case the structure inside the .zip file created an extra folder called plugins. Check this with unzip -l plugins.zip and look at the generated tree. This is my working structure:

Archive:  plugins.zip
Length      Date    Time    Name
    0  10-18-2021 11:39   hooks/
  125  10-18-2021 11:40   hooks/my_airflow_hook.py
    0  10-18-2021 11:40   sensors/
  359  10-18-2021 11:40   sensors/my_airflow_sensor.py
  395  10-18-2021 13:28   my_airflow_plugin.py
    0  10-18-2021 11:42   operators/
  437  10-18-2021 11:42   operators/hello_operator.py
  480  10-18-2021 11:42   operators/my_airflow_operator.py
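If you build the archive programmatically, a minimal sketch using Python's zipfile module (assuming a local plugins/ folder like the one above) keeps every member at the archive root, so no extra top-level folder appears:

```python
import zipfile
from pathlib import Path

def build_plugins_zip(plugins_dir: str, zip_path: str) -> None:
    """Zip the *contents* of plugins_dir so members sit at the archive root."""
    root = Path(plugins_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # arcname is relative to the plugins folder itself, so the
                # archive has no extra top-level "plugins/" directory
                zf.write(path, arcname=path.relative_to(root))

# Inspect the result the same way as `unzip -l`:
# print(zipfile.ZipFile("plugins.zip").namelist())
```

This is equivalent to running zip from inside the plugins folder rather than around it.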
gkimer

I believe the proper way is to place your custom Python modules in the plugins.zip file. This file is uploaded to MWAA and extracted to /usr/local/airflow/plugins/. The DAGs, by contrast, are synced to /usr/local/airflow/dags/.

AWS has published a User Guide with good explanations and examples.

dovregubben

You can load the plugin as a Python module like below:

import imp

# note: the imp module is deprecated and was removed in Python 3.12
customopertaor = imp.load_source('customopertaor', '/usr/local/airflow/plugins/operators/customopertaor.py')
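Since imp is gone in Python 3.12+, the same thing can be done with importlib; a sketch (the MWAA path mirrors the one above):

```python
import importlib.util

def load_module_from_path(name: str, path: str):
    """Load a single .py file as a module, like imp.load_source() did."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# e.g. on MWAA:
# customopertaor = load_module_from_path(
#     "customopertaor", "/usr/local/airflow/plugins/operators/customopertaor.py")
```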

Your plugin folder tree looks good.

You need to restart your airflow environment to take the new plugin into account.

Alternatively, you can enable the reload_on_plugin_change config option so plugins are picked up without a restart.
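In a plain airflow.cfg that option lives under the [webserver] section; on MWAA it would be set as the environment configuration option webserver.reload_on_plugin_change (a sketch, check the option name against your Airflow version):

```ini
[webserver]
; reload plugins when they change, without restarting the environment
reload_on_plugin_change = True
```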

Hugo

Elaborating on @gkimer's answer, which opened my eyes to what I was doing wrong. I guess you first attempted zip -r plugins.zip plugin_folder and got this structure:

$ unzip -l plugins.zip
Archive:  plugins.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  03-02-2023 15:05   plugin_folder/__init__.py
        0  03-02-2023 15:11   plugin_folder/operators/
        0  03-02-2023 15:06   plugin_folder/operators/__init__.py
     6577  03-07-2023 10:05   plugin_folder/operators/my_operator.py
      427  03-07-2023 10:05   plugin_folder/plugin.py
---------                     -------
     7004                     5 files

You can do that, but then you must use from plugin_folder.operators import * rather than from operators import *.

You instead have to zip the contents of the folder, e.g. (cd plugin_folder && zip -r ../plugins.zip .), and then you obtain

$ unzip -l plugins.zip 
Archive:  plugins.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  03-02-2023 15:05   __init__.py
        0  03-02-2023 15:11   operators/
        0  03-02-2023 15:06   operators/__init__.py
     6577  03-07-2023 10:05   operators/my_operator.py
      427  03-07-2023 10:05   plugin.py
---------                     -------
     7004                     5 files

In other words, whatever zip tool you use, take care that your __init__.py and plugin.py files are at the root of the zip file. I actually prefer the former way, since you then specify the plugin explicitly on each import.
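A quick sanity check before uploading, sketched with Python's zipfile module (the function name is mine, not part of any MWAA tooling):

```python
import zipfile

def plugins_zip_ok(zip_path: str) -> bool:
    """Heuristic: a correctly built plugins.zip has at least one file
    (e.g. __init__.py) at the archive root; a zip wrapped in an extra
    top-level folder has every file under that folder's prefix."""
    with zipfile.ZipFile(zip_path) as zf:
        files = [n for n in zf.namelist() if not n.endswith("/")]
    return any("/" not in n for n in files)
```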

AdagioMolto