Questions tagged [airflow]

Apache Airflow is a workflow management platform to programmatically author, schedule, and monitor workflows as directed acyclic graphs (DAGs) of tasks.

Airflow is a workflow scheduler. It was developed by Airbnb to manage its complicated workflows.

References

Related Tags

Similar workflow schedulers:

10104 questions
3
votes
1 answer

Airflow killing tasks that take too long with SIGKILL

I have a SQL Server database that I'm migrating to AWS S3 in the Parquet format to build a data lake. I'm using Apache Airflow to automate this task using DAGs. Each table in the schema, in this case, becomes a .parquet file, this serves for the S3…
Pj-
  • 430
  • 4
  • 14
3
votes
2 answers

Airflow 2 - debugging why dag is not loading

On Airflow 2 my dag is not showing on the UI, and I'm getting DAG Import Errors (...) for it. The error message is insufficient for me to debug (it's a custom operator, with a lot of custom logic - so I don't want to get into details of the error…
Grzegorz Skibinski
  • 12,624
  • 2
  • 11
  • 34
3
votes
1 answer

AWS MWAA (Managed Apache Airflow); Programmatically enable DAGs

We are using AWS MWAA. We add our DAG.py files to our S3 bucket programmatically. They then show up in the UI. However, they are "OFF" and you must click the "ON" button to start them. EDIT: Also we may sometimes want to turn a DAG that's ON to OFF…
Tommy
  • 12,588
  • 14
  • 59
  • 110
3
votes
0 answers

How to send Airflow Metrics to datadog

We have a requirement where we need to send Airflow metrics to Datadog. I tried to follow the steps mentioned here: https://docs.datadoghq.com/integrations/airflow/?tab=host Likewise, I included StatsD in the Airflow installation and updated the airflow…
Vibhav
  • 181
  • 5
  • 11
3
votes
1 answer

Airflow trigger_rule for grandparent tasks

I have the following DAG: task_a >> task_b >> task_c. task_b has the all_done trigger rule and task_c has the all_success trigger rule. If task_a fails, will task_c get executed?
Alexander
  • 55
  • 6
3
votes
0 answers

Airflow webserver failing every time

I am trying to set up Airflow as a systemd service. This fails with the following error every time: airflow[28159]: args.func(args) airflow[28159]: File "/opt/py37/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in…
Ajay
  • 41
  • 2
3
votes
1 answer

Where is the adhoc request option in Airflow 2.0.1?

I do not see an adhoc request in the dropdown within airflow 2.0.1. Does anyone have info on this? Was it removed from 2.0+? Any help will be greatly appreciated, I used this a lot.
RussBuss
  • 33
  • 2
3
votes
1 answer

Defining complex workflow dependency in airflow 2.0 taskflow API

Let's say I have the following dummy DAG defined as below: @dag(default_args=default_args, schedule_interval=None, start_date=days_ago(2)) def airflow_taskflow_api_dag(): cur_day = '2020-01-01' @task() def A(current_day: str): …
3
votes
1 answer

Using apache airflow docker operator with rootless docker

I'm working on a project that uses Apache Airflow to schedule different scripts. Airflow itself and the scripts setting up the DAGs are bundled up in one Dockerfile. The central part of each DAG is a DockerOperator which starts the appropriate…
Messias423
  • 31
  • 1
3
votes
1 answer

Configure logging retention policy for Apache airflow

I could not find in Airflow docs how to set up the retention policy I need. Currently, all airflow logs have to be manually deleted, else they will be kept forever on our servers... Not the best way to go. I wish to create global logs…
Stempler
  • 1,309
  • 13
  • 25
3
votes
0 answers

Backup Postgres Database to Azure Blob storage using Airflow DAG

I am currently working on a PostgreSQL database and have to create a database backup on a daily basis using an Airflow DAG. I have created a DAG and I am able to create the backup using BashOperator and the pg_dump command. But this approach reveals the…
3
votes
1 answer

AWS Managed Airflow - how to restart scheduler?

I have a problem parsing DAG with error: Broken DAG: [/usr/local/airflow/dags/test.py] No module named 'airflow.providers' I added apache-airflow-providers-databricks to requirements.txt, and see from the log that: Successfully installed…
3
votes
1 answer

Invalid syntax: Create table sortkey auto with initial sortkeys

I'm trying to use target-redshift to push data to AWS Redshift: https://pypi.org/project/target-redshift/ I am using Airflow to monitor ETL status. This is the error log and I have no clue what it means. Online documentation hardly exists for…
rojer_1
  • 55
  • 4
3
votes
1 answer

Dynamic DAGs build using Database information

I'm a newbie with Airflow and I'm trying to figure out the best approach to dynamically create a set of DAGs using information retrieved from a DB. Currently I've thought of this possible solution: # file: dags_builder_dag.py in…
3
votes
3 answers

Error installing apache-airflow: "Could not build wheels for setproctitle which use PEP 517 and cannot be installed directly"

I'm trying to find some help installing apache-airflow. I am on macOS 10.15.7, Python version 3.8.2, and I keep getting an error: "ERROR: Could not build wheels for setproctitle which use PEP 517 and cannot be installed directly". I have tried using…
user15305521
  • 33
  • 1
  • 3