I have started experimenting with Google Cloud Composer, where I deployed a few DAGs.

One of my DAGs shows the info message "This DAG seems to be existing only locally. The master scheduler doesn't seem to be aware of its existence." and cannot run, even manually. When I trigger it manually, it stays in the "running" state forever and never starts the first task.

As explained in detail below, the only difference between the two DAGs is that the broken one uses a custom operator.

Do you have any idea what is wrong here and how I can fix it?

Thanks

  1. hello2_gcp_plugins_v2 calls only the bash and email operators and works as expected (I received the email). If I configure a schedule_interval, it starts as expected. Even if I set the schedule interval to None, it works well when I start it manually.
  2. hello2_gcp_plugins_v5 calls a custom operator that I already deployed in the expected bucket. The custom operator just calls an API via the HttpHook to get data and uploads it to a GCS bucket via the GoogleCloudStorageHook. Whether the schedule interval is set or left as None, I always see the info message in the UI and the DAG never starts automatically. When started manually, it stays in the running state forever and the first task is never triggered.
AbdulAhmad Matin
Thibault Clement

2 Answers

I am answering my own question since I fixed it, and this may be useful if someone else runs into the same trouble.

Although it is not obvious, the message "This DAG seems to be existing only locally. The master scheduler doesn't seem to be aware of its existence." was caused by a buggy operator used in my DAG, in my case one of my custom operators.

To debug it, I clicked on the DAG -> Graph View -> clicked on my custom operator -> Task Instance Details, where the stack trace of the error in my operator was displayed.

I fixed my operator and uploaded the new version to the GCS bucket. After a few refreshes, the web UI no longer showed the message and my DAG was running.
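
If the error in the operator shows up at import time, a rough local check before uploading can save a round trip through the UI. The sketch below is not how Airflow loads DAGs; it only uses the Python standard library to import every .py file in a folder and collect the tracebacks. The function name find_import_errors and the folder layout are hypothetical.

```python
import importlib.util
import traceback
from pathlib import Path


def find_import_errors(dags_dir):
    """Import every .py file under dags_dir; return {file path: traceback text} for failures."""
    errors = {}
    for path in sorted(Path(dags_dir).rglob("*.py")):
        # Hypothetical module name; it only needs to be a valid identifier-like string.
        module_name = "dagcheck_" + path.stem
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        try:
            # Executes the file's top-level code, as an import would.
            spec.loader.exec_module(module)
        except Exception:
            errors[str(path)] = traceback.format_exc()
    return errors
```

Calling find_import_errors on a local copy of the DAGs and plugins folders returns an empty dict when every file imports cleanly. It will not catch errors that only occur when a task executes, so the Task Instance Details page remains the place to look for those.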

Thibault Clement

This can also happen if you add a new DAG without stopping the scheduler and it has not yet refreshed the DAGs folder to find the new DAGs. You can change the scheduler refresh time in airflow.cfg to make it refresh sooner.
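
The answer does not name the setting. One likely candidate, and this is an assumption, is dag_dir_list_interval in the [scheduler] section, which controls how often (in seconds) the DAGs folder is scanned for new files. A small sketch that reads that value from a config fragment with Python's configparser (the sample value 60 and the helper name are illustrative only):

```python
import configparser

# Illustrative fragment of an airflow.cfg; the value 60 is an example, not a recommendation.
SAMPLE_CFG = """
[scheduler]
# Seconds between scans of the DAGs folder for new files.
dag_dir_list_interval = 60
"""


def dag_dir_list_interval(cfg_text):
    """Return the [scheduler] dag_dir_list_interval from airflow.cfg-style text."""
    # interpolation=None avoids errors on literal '%' characters in real config files.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.getint("scheduler", "dag_dir_list_interval")


print(dag_dir_list_interval(SAMPLE_CFG))  # prints 60
```

On Cloud Composer, airflow.cfg is generally not edited by hand; such options are normally set as Airflow configuration overrides on the environment.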

someRandomGuy