3

I am trying to create a Cloud Composer DAG to be triggered via a Pub/Sub message. There is the following example from Google which triggers a DAG every time a change occurs in a Cloud Storage bucket: https://cloud.google.com/composer/docs/how-to/using/triggering-with-gcf

However, at the beginning of that page they say you can trigger DAGs in response to events, such as a change in a Cloud Storage bucket or a message pushed to Cloud Pub/Sub. I have spent a lot of time trying to figure out how that can be done, but with no result.

Can you please help or give me some directions? Thanks!

harry77

2 Answers

6

There are two ways to trigger a DAG from a Pub/Sub event.

  1. Place a PubSubPullSensor at the beginning of your DAG. Once a DAG run has started, the sensor task waits until a Pub/Sub message can be pulled, and then the rest of the tasks in the DAG execute.
  2. Create a Cloud Function with a Pub/Sub trigger and put the Composer DAG-triggering logic inside that function. When a message is published to the Pub/Sub topic, the Cloud Function runs and triggers the Composer DAG.
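For the second option, the "DAG triggering logic" usually comes down to an HTTP POST against the Airflow web server. Below is a minimal sketch of building that request, assuming the environment runs Airflow 2 and exposes the stable REST API (`/api/v1/dags/{dag_id}/dagRuns`). The helper name `build_dag_run_request`, the example URL, and the DAG id are hypothetical, and authentication is deliberately left out: a real Composer environment will reject the call unless you attach credentials.

```python
import json
import urllib.request


def build_dag_run_request(web_server_url, dag_id, conf):
    """Build (but do not send) a POST request asking Airflow to start a DAG run.

    Assumption: Airflow 2 stable REST API. Older environments used a different
    (experimental) endpoint, so check which API your environment exposes.
    """
    url = f"{web_server_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # The "conf" object becomes available to the DAG run as dag_run.conf.
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_dag_run_request(
    "https://example-airflow.invalid/",
    "pubsub_triggered_dag",
    {"message": "hello"},
)
print(request.get_method(), request.full_url)
```

Inside the Cloud Function you would add an authorization header to this request and send it, for example with `urllib.request.urlopen`.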
Ryan Yuan
  • I am working on the 2nd option, but I do not know what to put as `Endpoint URL` at the Pub/Sub subscription. Any help with that? – harry77 Oct 29 '19 at 00:32
  • @harry77 What do you mean by `Endpoint URL at the Pub/Sub subscription`. Which step are you up to? – Ryan Yuan Oct 29 '19 at 23:21
  • I tried the first option. Placed PubSubPullSensor in the beginning but DAG couldn't trigger itself automatically after new messages were published on the Pub/Sub topic. Looks like it's just an integration option which enables you to read messages published on Pub/Sub topic. – Balajee Venkatesh Jan 30 '20 at 05:38
  • @BalajeeVenkatesh If I understand your problem correctly, the solution to that is to kick off the DAG that contains the PubSubPullSensor beforehand. Therefore, the DAG will always be there listening to the Pub/Sub messages. – Ryan Yuan Jan 30 '20 at 05:43
  • PubSubPullSensor is a module which can be leveraged for reading messages from a Pub/Sub subscription. Eventually we create a "task" which incorporates this module. Kicking off the DAG beforehand enables the "Pull task" to start and keep waiting for a new message on the Pub/Sub topic. The task is over the moment it reads all the messages, and that leads to the execution of the next task of the DAG. It means the first task won't always be on, so the DAG won't be there to listen for messages forever. – Balajee Venkatesh Jan 30 '20 at 06:33
  • @BalajeeVenkatesh That's correct. So when it receives/senses the message, it should execute the next task as well as trigger the "sensor" DAG itself again. – Ryan Yuan Jan 30 '20 at 11:50
0

To extend the public documentation page you already posted: you can configure a Cloud Function to run each time a message is published to a Cloud Pub/Sub topic. There is more information about that on another public documentation page.

To attach a function to a topic, set the --trigger-topic flag when deploying the function:

gcloud functions deploy $FUNCTION_NAME --runtime $RUNTIME --trigger-topic $TOPIC_NAME
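As a rough sketch of what the deployed function could look like, assuming a Python runtime and the background-function signature `(event, context)`, where the Pub/Sub message body arrives base64-encoded in `event["data"]`. The function name `on_pubsub_message` and the shape of the returned `conf` dictionary are illustrative choices, not part of any Google API. The actual call to Composer is left out here; the function only prepares the data you would pass to the DAG.

```python
import base64
import json


def on_pubsub_message(event, context):
    """Entry point for a Pub/Sub-triggered function (hypothetical name)."""
    # Pub/Sub delivers the message body base64-encoded under "data".
    raw = base64.b64decode(event.get("data", "")).decode("utf-8")
    try:
        payload = json.loads(raw)
    except ValueError:
        # Not JSON: pass the plain text through unchanged.
        payload = {"message": raw}
    conf = {"payload": payload, "attributes": event.get("attributes") or {}}
    # Placeholder: this is where the DAG-triggering request would be sent.
    print(f"Would trigger DAG with conf: {json.dumps(conf)}")
    return conf


# Local simulation of the event a Pub/Sub trigger would deliver.
sample_event = {
    "data": base64.b64encode(b'{"file": "a.csv"}').decode("ascii"),
    "attributes": {"source": "local-test"},
}
on_pubsub_message(sample_event, None)
```

When deploying, `--entry-point on_pubsub_message` can be added to the command above if the function name differs from `$FUNCTION_NAME`.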
hexacyanide