Questions tagged [airflow]

Apache Airflow is a workflow management platform to programmatically author, schedule, and monitor workflows as directed acyclic graphs (DAGs) of tasks.

Airflow is a workflow scheduler, originally developed at Airbnb to manage its complex workflows.
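
As a rough illustration of what "a DAG of tasks" means, here is a minimal sketch in plain Python. It uses the standard-library graphlib module (Python 3.9+), not Airflow's own API, and the task names are hypothetical. It only shows how dependencies between tasks determine a valid run order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one valid run order for the tasks; raises CycleError if deps contain a cycle."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Report whether the dependency mapping really is a DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

Because every task here depends on exactly one predecessor, there is only one valid order. A real workflow usually has branches, so several orders may be valid and independent tasks can run in parallel.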

10104 questions
38
votes
3 answers

Starting Airflow webserver fails with sqlalchemy.exc.NoInspectionAvailable: No inspection system is available

Installation completed properly and the db was initialized properly, but trying to start the webserver shows the following error. I reinstalled everything but it's still not working. I would appreciate it if anyone could help me. Console output: $:~/airflow# airflow webserver -p…
Masood Bashamaq
38
votes
3 answers

How to set up Airflow Send Email?

I followed an online tutorial to set up an email SMTP server in airflow.cfg as below: [email] email_backend = airflow.utils.email.send_email_smtp [smtp] # If you want airflow to send emails on retries, failure, and you want to use # the…
Peter Cui
37
votes
3 answers

Airflow structure/organization of Dags and tasks

My questions: What is a good directory structure for organizing your DAGs and tasks? (The DAG examples show only a couple of tasks.) I currently have my DAGs at the root of the dags folder and my tasks in separate directories, not sure is the…
nono
36
votes
5 answers

Is there a way to create/modify connections through Airflow API

Going through Admin -> Connections, we have the ability to create/modify a connection's params, but I'm wondering if I can do the same through the API so I can programmatically set the connections. airflow.models.Connection seems like it only deals with…
JChao
36
votes
4 answers

Airflow: How to SSH and run BashOperator from a different server

Is there a way to SSH to a different server and run BashOperator using Airbnb's Airflow? I am trying to run a hive sql command with Airflow but I need to SSH to a different box in order to run the hive shell. My tasks should look like this: SSH to…
CMPE
35
votes
5 answers

Running google colab every day at a specific time

I recently built a Python program that runs on Google Colaboratory. I need to run the program every day at a specific time. Is there any way to schedule it to run on Google Colab?
Sado
35
votes
6 answers

Airflow tasks get stuck at "queued" status and never gets running

I'm using Airflow v1.8.1 and run all components (worker, web, flower, scheduler) on Kubernetes & Docker. I use Celery Executor with Redis and my tasks look like: (start) -> (do_work_for_product1) ├ -> (do_work_for_product2) ├ ->…
Norio Akagi
34
votes
13 answers

Airflow not loading dags in /usr/local/airflow/dags

Airflow seems to be skipping the DAGs I added to /usr/local/airflow/dags. When I run airflow list_dags, the output shows [2017-08-06 17:03:47,220] {models.py:168} INFO - Filling up the DagBag from…
Jeremy Lewi
34
votes
4 answers

Accessing configuration parameters passed to Airflow through CLI

I am trying to pass the following configuration parameters to the Airflow CLI while triggering a DAG run. The following is the trigger_dag command I am using: airflow trigger_dag -c '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}' …
devj
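
One detail of the -c payload quoted above is worth noting, however the DAG ends up receiving it: account_list is wrapped in quotes, so after JSON decoding it is a string, not a list. The sketch below uses only the standard-library json module and the exact payload from the excerpt. It makes no assumption about how Airflow hands the configuration to a task, and the variable names are hypothetical.

```python
import json

# The exact payload from the trigger_dag command in the excerpt.
raw_conf = '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}'

# First decode: yields a dict whose values are both strings.
conf = json.loads(raw_conf)

# account_list is itself a JSON-encoded string, so decode it a second time
# to obtain a real Python list.
accounts = json.loads(conf["account_list"])

print(type(conf["account_list"]).__name__)
print(accounts)
```

Passing the list unquoted in the payload ("account_list": [1,2,3,4,5]) would avoid the second decoding step.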
33
votes
0 answers

psycopg2 could not translate host name

My Airflow server periodically fails. When I check the gunicorn logs, the error before all workers shut down looks like this: OperationalError: (psycopg2.OperationalError) could not translate host name…
Brett
33
votes
2 answers

How to define Airflow DAG/task that shouldn't run periodically

The goal is pretty simple: I need to create a DAG for a manual task that should not run periodically, but only when an admin presses the "Run" button. Ideally without needing to "unpause" and "pause" the DAG (you know someone will surely forget…
Ikar Pohorský
33
votes
4 answers

Example DAG gets stuck in "running" state indefinitely

In my first foray into Airflow, I am trying to run one of the example DAGs that come with the installation. This is v.1.8.0. Here are my steps: $ airflow trigger_dag example_bash_operator [2017-04-19 15:32:38,391] {__init__.py:57} INFO - Using…
gcbenison
32
votes
6 answers

Airflow signals SIGTERM to subprocesses unexpectedly

I am using the PythonOperator to call a function that parallelizes a data engineering process as an Airflow task. This is done simply by wrapping a simple function with a callable wrapper function called by Airflow. def wrapper(ds, **kwargs): …
jiminssy
32
votes
4 answers

Apache Airflow DAG cannot import local module

I do not seem to understand how to import modules into an Apache Airflow DAG definition file. I want to do this so I can create a library that makes declaring tasks with similar settings less verbose, for instance. Here is the simplest…
fildred13
32
votes
7 answers

Running Job On Airflow Based On Webrequest

I wanted to know if Airflow tasks can be executed upon receiving a request over HTTP. I am not interested in the scheduling part of Airflow. I just want to use it as a substitute for Celery. So an example operation would be something like this. User…