
I'm running a DAG, test_dag.py, which is laid out as follows in my Google Cloud Storage bucket:

gcs-bucket/
    dags/
        test_dag.py
        dependencies/
            __init__.py
            dependency_1.py
            module1/
                __init__.py
                dependency_2.py

Airflow detects the DAG, test_dag.py, which tries to import from dependencies/dependency_1.py (which imports successfully) and from dependencies/module1/dependency_2.py, which fails with the error: Broken DAG: [/home/airflow/gcs/dags/test_dag.py] module 'dependencies' has no attribute 'module1'.

The line causing this is from dependencies.module1 import dependency_2.

This seems to indicate that Cloud Composer is unable to import from a subdirectory within dependencies/. When I look at their documentation on dependencies, the example they give is only one directory level down from /dags (and is a single file rather than a full Python package).

Here is the weird part though -- it runs successfully when I run this locally in Airflow (not on Cloud Composer). So I'm at a loss for why my imports would work locally but not on Cloud Composer.

I've also tried importing everything from within my __init__.py files, which gives me the same attribute error, and moving my dependencies a level up into gcs-bucket/, where they can't be found at all.

When I print out __file__ from within my DAG I get /home/airflow/gcs/dags/test_dag.py, and when I print sys.path I get:

['/usr/local/bin', '/opt/python3.6/lib/python36.zip', '/opt/python3.6/lib/python3.6', '/opt/python3.6/lib/python3.6/lib-dynload', '/opt/python3.6/lib/python3.6/site-packages', '/usr/local/lib/airflow', '/home/airflow/gcs/dags', '/etc/airflow/config', '/home/airflow/gcs/plugins']
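One way to narrow this down is to check which files the names dependencies and dependencies.module1 actually resolve to in each environment, since the sys.path above lists several directories that are searched before /home/airflow/gcs/dags. The following is a minimal diagnostic sketch using only the standard library. The helper name describe_module is hypothetical, and nothing in it is specific to Cloud Composer.

```python
import importlib.util


def describe_module(name):
    """Return (origin, search_locations) for a module name, or None if it cannot be found."""
    try:
        # For dotted names, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
    except (ImportError, AttributeError, ValueError):
        # The parent package is missing, or the parent is not a package at all.
        return None
    if spec is None:
        return None
    # submodule_search_locations is None for plain modules, so normalise to a list.
    return spec.origin, list(spec.submodule_search_locations or [])


if __name__ == "__main__":
    # Run this from inside the DAG file (or next to it) in both environments.
    for name in ("dependencies", "dependencies.module1"):
        print(name, "->", describe_module(name))
```

If the origin reported on Cloud Composer is not under /home/airflow/gcs/dags, a different module with the same name is being picked up first. That is only one possible explanation and is not confirmed by anything in this question.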

I'm totally at a loss here. Any help would be much appreciated. Thank you.

EDIT: It seems that Cloud Composer does not like it when dependencies try to import other dependencies (see comments below). Is there some way around this?
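For reference, the case described in the edit can be reduced to a self-contained script. The sketch below recreates the directory layout from the question in a temporary directory and imports dependency_2, which in turn imports dependency_1. The file contents (the VALUE constants) and the helper names build_layout and import_nested are hypothetical, since the real package is not shown. It only provides a plain-Python baseline and says nothing about how Cloud Composer behaves.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_layout(root):
    """Create dags/dependencies/module1/ under root, with placeholder file contents."""
    dags = Path(root) / "dags"
    pkg = dags / "dependencies"
    sub = pkg / "module1"
    sub.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (sub / "__init__.py").write_text("")
    (pkg / "dependency_1.py").write_text("VALUE = 1\n")
    # dependency_2 imports dependency_1: the case discussed in the comments.
    (sub / "dependency_2.py").write_text(
        "from dependencies import dependency_1\n"
        "VALUE = dependency_1.VALUE + 1\n"
    )
    return dags


def import_nested(dags_dir):
    """Import dependencies.module1.dependency_2 with dags_dir on sys.path and return its VALUE."""
    sys.path.insert(0, str(dags_dir))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("dependencies.module1.dependency_2")
        return module.VALUE
    finally:
        # Undo the path change and drop the cached modules so reruns start clean.
        sys.path.remove(str(dags_dir))
        for name in [n for n in sys.modules
                     if n == "dependencies" or n.startswith("dependencies.")]:
            del sys.modules[name]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(import_nested(build_layout(tmp)))
```

Uploading the same four files to the dags/ folder and importing them from a small DAG would show whether the failure comes from the layout itself or from something else in the real package.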

Matt
  • Is it possible for you to make your dependencies code available? – rmesteves May 06 '20 at 10:07
  • It's a bit tough to do that to be honest, this is a minimum viable example but in reality the dependencies directory is a huge package of interconnected pieces. If there is a specific part you'd like to see I can copy it over. The reason I don't think it necessarily has to do with that code though, is that it runs fine when I run it locally, so I don't think the way the imports are structured is the issue here – Matt May 06 '20 at 16:08
  • I tried reproducing your problem but it worked fine for me. I could import both a module in a one level directory and in a 2 level directory – rmesteves May 07 '20 at 10:14
  • Can you check if all the dependencies and import names are right? – rmesteves May 07 '20 at 10:22
  • I'll double check everything, but can I ask two follow-ups to that: are you able to import dependency 1 from within dependency 2? And more importantly, did you have to include a 'setup.py' file or anything else that I might have forgotten to make the imports work? – Matt May 07 '20 at 11:59
  • I just don’t understand how my imports can be wrong in cloud composer but it’s able to run locally on Airflow :/ – Matt May 07 '20 at 12:00
  • And to confirm, you put your dependency packages with /dags, correct? – Matt May 07 '20 at 12:01
  • The dependency directory is in my dags/ folder. Did you try to import dependency 1 inside dependency 2? I imported both in the DAG. If you imported dependency 1 inside dependency 2, I think that changes the situation. If you confirm, I can test it too – rmesteves May 07 '20 at 14:41
  • Yes, I have done that. I would be curious whether it works on your end – Matt May 07 '20 at 14:54
  • I tested it now and had the same problem as you. Can this structure of dependencies be changed? – rmesteves May 07 '20 at 16:18
  • That's a relief. I'm glad to hear that I'm not totally crazy. The package is honestly so interwoven at this point that it would be a huge job to restructure it entirely. Do you happen to know of any alternative ways that this can work? Maybe through plugins or some other way of importing the dependency? – Matt May 07 '20 at 16:32
  • Can you confirm that you are trying to import dependency 1 within dependency 2? Because in your question you said "The line causing this is from dependencies.module1 import dependency_2." – rmesteves May 08 '20 at 10:22
  • @Matt - I'm just wondering if you resolved this issue? I'm facing the exact same problem :( – Qorbani Nov 07 '20 at 17:00
  • @Qorbani I did not. I was never able to get it to work – Matt Nov 07 '20 at 17:01
  • Has anyone solved this? It seems impossible that Cloud Composer is unable to deal with custom modules – Giorgio Aug 25 '22 at 03:14

1 Answer


Could you add an __init__.py under the dags/ folder and give it a try?

Nandakishore