Is there any way (a blueprint) to trigger an Airflow DAG in MWAA when a pull request is merged, preferably via GitHub Actions? Thanks!
2 Answers
(Answering for Airflow without specific context to MWAA)
Airflow offers a REST API with a trigger-DAG endpoint, so in theory you can configure a GitHub Action that runs after a PR is merged and triggers a DAG run via a REST call.
In practice this will not work as you expect.
Airflow is not synchronous with your merges (even if the merge dumps code directly into the DAG folder and there is no additional wait time for GitSync). Airflow has a DAG file processing service that scans the DAG folder and looks for changed files. It processes the changes, and then the DAG is registered in the database. Only after that can Airflow use the new code. This serialization process is important: it makes sure that different parts of Airflow (webserver, etc.) don't need access to your DAG folder.
This means that if you invoke a DAG run right after the merge, you risk executing an older version of your code.
I don't know why you need such a mechanism; it's not a very typical requirement, and I'd advise you not to try to force this idea into your deployment.
To clarify: if, under a specific deployment, you can confirm that the code you deployed has been parsed and registered as a DAG in the database, then there is no risk in doing what you are after. This is probably a very rare and unique case.
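For plain Airflow 2.x, the "confirm it was parsed, then trigger" idea could be sketched roughly as follows. This is a minimal sketch, not a recommended deployment pattern. It assumes the stable REST API is enabled with basic auth, that the DAG details endpoint reports `last_parsed_time` (Airflow 2.3+), and that `deployed_at` is the time the files actually landed in the DAG folder, not the merge time. `AIRFLOW_URL`, `DAG_ID`, and the function names are hypothetical.

```python
import base64
import json
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone

AIRFLOW_URL = "https://airflow.example.com"  # hypothetical webserver URL
DAG_ID = "my_dag"                            # hypothetical DAG id


def is_parsed_since(dag_info, deployed_at):
    """True if the DAG's last_parsed_time is at or after deployed_at."""
    last = dag_info.get("last_parsed_time")
    if not last:
        return False
    parsed = datetime.fromisoformat(last.replace("Z", "+00:00"))
    return parsed >= deployed_at


def api_request(path, user, password, payload=None):
    """Call the Airflow stable REST API (basic auth is an assumption)."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        f"{AIRFLOW_URL}/api/v1{path}",
        data=data,
        method="POST" if payload is not None else "GET",
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def trigger_when_parsed(user, password, deployed_at, timeout_s=600, poll_s=15):
    """Poll until the DAG was re-parsed after deployed_at, then trigger a run."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            info = api_request(f"/dags/{DAG_ID}", user, password)
        except urllib.error.HTTPError as err:
            if err.code != 404:  # 404: DAG not registered in the database yet
                raise
            info = {}
        if is_parsed_since(info, deployed_at):
            return api_request(
                f"/dags/{DAG_ID}/dagRuns", user, password, payload={"conf": {}}
            )
        time.sleep(poll_s)
    raise TimeoutError(f"{DAG_ID} was not re-parsed within {timeout_s} seconds")
```

A fresh `last_parsed_time` only shows that the file processor has parsed the file since the deploy. It does not prove that every other file the DAG depends on has been synced, so the caveats above still apply.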

-
As far as I know, MWAA does not expose the API as Airflow does, but it works with the CLI. – ozs Jan 09 '23 at 13:24
-
@ozs I added a note that I'm answering about Airflow, but this doesn't matter. It's not recommended in Airflow, so it's also not recommended in MWAA. This is not about how to get this done; it's about this not being the right approach, and it's not going to work as expected for the reasons listed in the answer. If one understands the implications and is willing to go ahead, then that is another matter. – Elad Kalif Jan 09 '23 at 13:26
-
You're right about what you are saying. But if the change is not in Airflow itself but in another repo that Airflow executes (a repo that builds a Docker image on merge), then I don't see a problem. – ozs Jan 09 '23 at 13:41
-
@ozs Not sure I understand that type of deployment (among other things, how you protect against broken DAGs), but I'll add a clarification to my answer in any case. – Elad Kalif Jan 09 '23 at 14:24
-
Elad and @ozs, thanks for your answers and the great discussion. I have a dbt project. The plan is that CI/CD will first sync the code base with S3, which is the MWAA DAG folder. Here it says "When the environment updates, Amazon MWAA downloads the dbt directory to the local usr/local/airflow/dags/ folder." (https://docs.aws.amazon.com/mwaa/latest/userguide/samples-dbt.html#samples-dbt-upload-project). Then it will trigger the DAG to build the models in dbt. So I guess the question is how quickly MWAA picks up the new environment? – AlizaminJ Jan 09 '23 at 23:09
You need to create a role in AWS.
Give it a permissions policy that allows airflow:CreateCliToken:
    {
      "Action": "airflow:CreateCliToken",
      "Effect": "Allow",
      "Resource": "*"
    }
Add a trust relationship (with your account and repo):
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "Federated": "arn:aws:iam::{account_id}:oidc-provider/token.actions.githubusercontent.com"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringLike": {
              "token.actions.githubusercontent.com:sub": "repo:{repo-name}:*"
            }
          }
        }
      ]
    }
In the GitHub Actions workflow, grant the job the permissions below and configure AWS credentials with role-to-assume:
    permissions:
      id-token: write
      contents: read

    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v1
      with:
        role-to-assume: arn:aws:iam::{account_id}:role/{role-name}
        aws-region: {region}
Then call MWAA using the CLI. See the AWS reference for how to create a CLI token and run a DAG.
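The final step could look roughly like the Python sketch below, run as a workflow step after the credentials are configured. It assumes `boto3` is installed on the runner, and it follows the CLI-token flow as I understand it from the AWS docs: POST the Airflow CLI command as plain text to `https://<WebServerHostname>/aws_mwaa/cli` with the token as a bearer header, then base64-decode `stdout` and `stderr` from the JSON response. Check that against the current AWS reference before relying on it. The function names and the environment, region, and DAG values are hypothetical.

```python
import base64
import json
import urllib.request


def get_cli_token(env_name, region):
    """Fetch a short-lived MWAA CLI token using the assumed role's credentials."""
    import boto3  # third-party; imported here so the helpers below work without it

    client = boto3.client("mwaa", region_name=region)
    resp = client.create_cli_token(Name=env_name)
    return resp["WebServerHostname"], resp["CliToken"]


def build_cli_request(hostname, token, command):
    """Build the POST request that carries an Airflow CLI command to MWAA."""
    return urllib.request.Request(
        f"https://{hostname}/aws_mwaa/cli",
        data=command.encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "text/plain"},
        method="POST",
    )


def decode_cli_response(body):
    """Decode the base64-encoded stdout and stderr from the MWAA CLI response."""
    result = json.loads(body)
    stdout = base64.b64decode(result["stdout"]).decode("utf-8")
    stderr = base64.b64decode(result["stderr"]).decode("utf-8")
    return stdout, stderr


def trigger_dag(env_name, region, dag_id):
    """Run `dags trigger <dag_id>` on the given MWAA environment."""
    hostname, token = get_cli_token(env_name, region)
    req = build_cli_request(hostname, token, f"dags trigger {dag_id}")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return decode_cli_response(resp.read())


# Example (hypothetical names):
# out, err = trigger_dag("my-mwaa-env", "us-east-1", "my_dag")
```

The token is short-lived, so fetch it right before the call instead of caching it between workflow runs.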
