Questions tagged [airflow]

Apache Airflow is a workflow management platform to programmatically author, schedule, and monitor workflows as directed acyclic graphs (DAGs) of tasks.

Airflow was originally developed at Airbnb to manage the company's complex workflows.
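To make the "DAG of tasks" idea concrete, here is a minimal plain-Python sketch using the standard-library `graphlib` module (Python 3.9+). It does not use the Airflow API. The task names (`extract`, `transform`, `load`, `report`) and the helper `run_order` are hypothetical, chosen only to show how a dependency graph determines execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(deps):
    """Return one valid execution order for a DAG of task dependencies.

    Raises graphlib.CycleError if the graph contains a cycle,
    i.e. if it is not actually acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A scheduler such as Airflow does considerably more than this (scheduling, retries, monitoring), but the ordering constraint is the same: a task becomes runnable only after everything it depends on has finished.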

10104 questions
32
votes
4 answers

Python script scheduling in airflow

Hi everyone, I need to schedule my Python files (which contain data extraction from SQL and some joins) using Airflow. I have successfully installed Airflow on my Linux server, and the Airflow webserver is available to me. But even after going…
Abhishek Pansotra
  • 947
  • 2
  • 13
  • 17
32
votes
4 answers

What does the landing time mean in airflow?

There is a section called "landing time" in the DAG view on the web console of airflow. An example screen shot taken from airbnb's blog: But what does it mean? There is no definition in the documents or in their repository.
user6442810
  • 558
  • 1
  • 4
  • 8
31
votes
3 answers

Use case of dummy operator

I was learning apache airflow and found that there is an operator called DummyOperator. I googled about its use case, but couldn't find anything that I can understand. Can anyone here please discuss its use case?
Nabin
  • 11,216
  • 8
  • 63
  • 98
31
votes
4 answers

Make custom Airflow macros expand other macros

Is there any way to make a user-defined macro in Airflow which is itself computed from other macros? from airflow import DAG from airflow.operators.bash_operator import BashOperator dag = DAG( 'simple', schedule_interval='0 21 * * *', …
mxxk
  • 9,514
  • 5
  • 38
  • 46
30
votes
3 answers

Get Exception details on Airflow on_failure_callback context

Is there any way to get the exception details on the airflow on_failure_callback? I've noticed it's not part of context. I'd like to create a generic exception handling mechanism which posts to Slack information about the errors, including details…
nervokid
  • 401
  • 1
  • 4
  • 8
30
votes
3 answers

Where do you view the output from airflow jobs

In the airflow tutorial, the BashOperators have output (via echo). If the task runs in the scheduler, where do you view the output? Is there a console or something? I'm sure I'm just not looking in the right place.
Dan
  • 45,079
  • 17
  • 88
  • 157
30
votes
6 answers

Debugging Broken DAGs

When the Airflow webserver shows errors like Broken DAG: [], how and where can we find the full stack trace for these exceptions? I tried these locations: /var/log/airflow/webserver -- had no logs in the timeframe of…
arbazkhan002
  • 1,283
  • 2
  • 13
  • 18
30
votes
3 answers

How to use AirFlow to run a folder of python files?

I have a series of Python tasks inside a folder of python files: file1.py, file2.py, ... I read the Airflow docs, but I don't see how to specify the folder and filename of the python files in the DAG? I would like to execute those python files (not…
tensor
  • 3,088
  • 8
  • 37
  • 71
29
votes
4 answers

I am getting "bash: airflow: command not found"

I am getting -bash: airflow: command not found after installing Apache Airflow. I am using Google Cloud Compute Engine and OS is Debian 9 (Stretch). I have followed the below steps: export AIRFLOW_HOME=~/airflow pip install apache-airflow
Md Sirajus Salayhin
  • 4,974
  • 5
  • 37
  • 46
28
votes
1 answer

Why do I get no such table error when installing Apache Airflow on Mac?

It was so hard to put that right title. Ok, here it goes. I was following this tutorial to install Apache Airflow on my Mac (Mojave version) - https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b Right at the first step…
VKarthik
  • 1,379
  • 2
  • 15
  • 30
28
votes
2 answers

Airflow S3KeySensor - How to make it continue running

With the help of this Stack Overflow post I just made a program (the one shown in the post) where, when a file is placed inside an S3 bucket, a task in one of my running DAGs is triggered and then I perform some work using the BashOperator. Once it's…
Kyle Bridenstine
  • 6,055
  • 11
  • 62
  • 100
28
votes
2 answers

Airflow Python Unit Test?

I'd like to add some unit tests for our DAGs, but could not find any. Is there a framework for unit test for DAGs? There is an End-to-End testing framework that exists but I guess it's dead: https://issues.apache.org/jira/browse/AIRFLOW-79. Please…
Chengzhi
  • 2,531
  • 2
  • 27
  • 41
27
votes
2 answers

With code, how do you update an airflow variable?

I need to programmatically update a variable I have created in Airflow, but I cannot find out how to do that with code. I have retrieved my variable with this code: column_number = Variable.get('column_number') At the end of the function, I…
Justin Besteman
  • 378
  • 1
  • 6
  • 17
27
votes
5 answers

Airflow: Creating a DAG in airflow via UI

Airflow veterans please help, I was looking for a cron replacement and came across apache airflow. We have a setup where multiple users should be able to create their own DAGs and schedule their jobs. Our users are a mix of people who may not know…
Mukul Jain
  • 1,807
  • 9
  • 26
  • 38
27
votes
2 answers

Airflow scheduler is slow to schedule subsequent tasks

When I try to run a DAG in Airflow 1.8.0, I find that a lot of time passes between the completion of the predecessor task and the time at which the successor task is picked up for execution (usually greater than the execution times of individual…
Prasann
  • 460
  • 1
  • 5
  • 13