Questions tagged [airflow]

Apache Airflow is a workflow management platform to programmatically author, schedule, and monitor workflows as directed acyclic graphs (DAGs) of tasks.

Airflow was originally developed at Airbnb to manage its complex workflows.

10104 questions
3 votes
2 answers

SimpleHttpOperator Airflow, data templated

I'm trying to correctly render data inside a SimpleHttpOperator in Airflow, using configuration that I send via dag_run: result = SimpleHttpOperator( task_id="schema_detector", http_conn_id='schema_detector', …
Tizianoreica
  • 2,142
  • 3
  • 28
  • 43
3 votes
1 answer

airflow 2 / docker-compose: how to install Python dependencies for DAGs?

I have installed Airflow 2.0.2 using docker-compose as described under https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html. I have researched for quite some time, but I can't find a way to install Python dependencies for my DAGs. I…
Requin
  • 467
  • 4
  • 16
3 votes
1 answer

Airflow PythonOperator inside PythonOperator

How can I run a PythonOperator inside another PythonOperator? The idea is to call the "main" function as a PythonOperator and then run and schedule a few other PythonOperators inside it. The code is: def printFunction(value): time.sleep(5) …
Jjvzd
  • 35
  • 3
3 votes
1 answer

NameError: name '_mysql' is not defined -- On airflow start in MacOSX

There are a number of articles on this question, but none of them worked for me. The detailed error is as follows: Traceback (most recent call last): File "/Users/hiteshagarwal/Documents/venv/lib/python3.7/site-packages/MySQLdb/__init__.py",…
3 votes
4 answers

Airflow db init ERROR - Failed to add operation for GET /api/v1/connections

I am trying to install Airflow 2.0.1 with Ansible on a CentOS 8 machine, with Python 3.8.1. I set pip to 20.2.4 as suggested in the Airflow docs. I am using PostgreSQL, and airflow db check is successful, but the db init task gives the following error. I…
Erkan Şirin
  • 1,935
  • 18
  • 28
3 votes
1 answer

Airflow - Stop DAG based on condition (skip remaining tasks after branch)

I am new to Airflow, so I have a question. I want to run a DAG only if a condition in the first task is satisfied. If the condition is not satisfied, I want to stop the DAG after the first task. Example: # first task def get_number_func(**kwargs): …
bigdataadd
  • 191
  • 1
  • 10
3 votes
2 answers

How to run HDFS Copy commands using Airflow?

How can I execute HDFS copy commands on a Dataproc cluster using Airflow? After the cluster is created using Airflow, I have to copy a few JAR files from Google storage to the HDFS master node folder.
3 votes
1 answer

Airflow DAG Not following schedule

I have a DAG that is scheduled to run once a month. My problem is that the scheduler is not kicking off the job: args = { 'owner': 'Airflow', 'start_date': dates.days_ago(1), 'email': ['sinistersparrow1701@gmail.com', 'rich@offrs.com'], …
arcee123
  • 101
  • 9
  • 41
  • 118
3 votes
2 answers

How to avoid DAG Import Errors in Apache Airflow for worker node dependencies?

I'm working on a container-based Apache Airflow application. My environment is made up of the following components: Airflow Scheduler container, Airflow Webserver container, Airflow Celery Flower container, Airflow Worker container (1), etc. My…
Marco Miduri
  • 123
  • 1
  • 8
3 votes
4 answers

Airflow Sensor - timeout

tl;dr, problem framing: Assume I have a sensor poking with timeout = 24*60*60. Since the connection does time out occasionally, retries must be allowed. If the sensor now retries, the timeout variable is applied to every new try with the…
Bennimi
  • 416
  • 5
  • 14
3 votes
1 answer

Airflow/Amazon EMR: The VPC/subnet configuration was invalid: Subnet is required : The specified instance type m5.xlarge can only be used in a VPC

I want to create an EMR cluster on Amazon EMR, triggered via Airflow. The EMR cluster shows up in the Amazon EMR UI, but with an error saying: "The VPC/subnet configuration was invalid: Subnet is required : The specified instance type m5.xlarge can…
3 votes
0 answers

Reuse Airflow hooks with kubernetes operator

Is it possible to import the hooks that Airflow provides (Snowflake hook, AWS hook, etc.) in a Kubernetes operator that runs a Python script? I may have the wrong idea of how to work with Airflow. Example: I have Airflow running with a…
3 votes
1 answer

import error after upgrade to airflow2.0.2

I received an import error after upgrading to the airflow2.0.2-python3.7 image. The package seems to be installed, so I'm not sure what is causing the issue or how to fix it. I tried uninstalling and reinstalling the packages, but that does not work…
hd99
  • 33
  • 1
  • 3
3 votes
1 answer

Accessing Airflow REST API in AWS Managed Workflows?

I have Airflow running in AWS MWAA and would like to access the REST API. There are two ways to do this, but neither seems to work for me. Overriding api.auth_backend: this used to work, but now AWS MWAA won't allow you to add this; it is considered as…
3 votes
2 answers

Multiple airflow schedulers

I am trying to install a three-node Airflow cluster. Each node has an Airflow scheduler, Airflow worker, and Airflow webserver, as well as Celery, a RabbitMQ cluster, and a Postgres multi-master cluster (implemented with Bucardo). Software versions: Airflow…
Denis
  • 31
  • 1
  • 3