Questions tagged [airflow-scheduler]

The Apache Airflow scheduler monitors all tasks and all DAGs and triggers the task instances whose dependencies have been met. Apache Airflow is a platform to programmatically author, schedule, and monitor workflows.

1257 questions
5
votes
1 answer

How to use the same DAG with multiple schedule_intervals in airflow?

I have one DAG that I pass a variety of configurations, and one of the settings I want to pass is how often it should run. For example, using the same DAG, I have two different RUNS. RUN A I want to run daily. RUN B I want to run weekly. Both of…
Adam K
  • 51
  • 3
5
votes
1 answer

Is a simultaneous run of the same DAG possible in Airflow?

We have one Airflow DAG that accepts input from the user and performs some task. We want to run the same DAG simultaneously with different user inputs. We found multiple links about simultaneous task runs but could not find information about simultaneous…
Yug
  • 105
  • 1
  • 9
5
votes
0 answers

Should I implement Apache Airflow or only work with Celery

I have a Flask-based application using the Microservice Architecture. I have multiple services such as Scrapy for scraping product data. Multiple API Integration with the different service providers to pull and push data. ETL processes to message…
Rupesh Desai
  • 171
  • 2
  • 5
5
votes
0 answers

Airflow Kubernetes Pods Exception ERROR - (404) Reason: Not Found

I am looking for support to debug this Airflow KubernetesPodOperator issue. We randomly get this error when the Airflow task executes. The job is almost finished, and at the end of the job execution a pod-not-found exception is thrown…
5
votes
1 answer

Connection pooling for external connections in Airflow

I am trying to find a way to manage connection pools for external connections created in Airflow. Airflow version: 2.1.0; Python version: 3.9.5; Airflow DB: SQLite; external connections created: MySQL and Snowflake. I know there are properties…
5
votes
3 answers

How to skip a task in airflow without skipping its downstream tasks?

Let’s say this is my DAG: A >> B >> C. If task B raises an exception, I want to skip the task instead of failing it. However, I don’t want to skip task C. I looked into AirflowSkipException and the soft_fail sensor but they both forcibly skip…
5
votes
0 answers

Airflow tasks stuck in queued state

We're running Airflow 1.10.12, with KubernetesExecutor and KubernetesPodOperator. In the past few days, we’re seeing tasks getting stuck in queued state for a long time (to be honest, unless we restart the scheduler, it will remain stuck in that…
Meny Issakov
  • 1,400
  • 1
  • 14
  • 30
5
votes
4 answers

Airflow 2.0 API response 403 Forbidden

I'm trying to trigger a new dag run via Airflow 2.0 REST API. If I am logged in to the Airflow webserver on the remote machine and I go to the swagger documentation page to test the API, the call is successful. If I log out or if the API call is…
Adeel Hashmi
  • 767
  • 1
  • 8
  • 20
5
votes
0 answers

Error: No response from Gunicorn master within 120 seconds . Shutting down webserver. Airflow-Webserver Service won't start

I want to monitor my Airflow worker logs with the help of Prometheus, so I looked on the internet and found that statsd-exporter can help me. But when I added the required statsd-exporter configuration to airflow.cfg, the service won't…
dataintransit
  • 174
  • 1
  • 7
5
votes
1 answer

How to trigger an Airflow task only when a new partition/data is available in the AWS Athena table using a DAG in Python?

I have a scenario like the one below: trigger Task 1 and Task 2 only when new data is available for them in the source table (Athena). The trigger for Task 1 and Task 2 should happen when there is a new data partition in a day. Trigger Task 3 only on the completion of…
5
votes
1 answer

Airflow Dependencies Blocking Task From Getting Scheduled

I have an Airflow instance that had been running with no problem for 2 months until Sunday. There was a blackout in a system on which my Airflow tasks depend, and some tasks were queued for 2 days. After that we decided it was better to mark all the…
María
  • 51
  • 1
  • 2
5
votes
1 answer

How to write to a local file path in Airflow - MacOS

I am writing an Airflow pipeline which involves writing the results to a CSV file located on my local file system. I am using MacOS and the file path is similar to /User/name/file_path/file_name.csv. Here is my code: from airflow import DAG from…
5
votes
0 answers

Airflow manually run DAG with ExternalTaskSensor

I have a DAG with an ExternalTaskSensor in it. I set execution_delta correctly and everything works perfectly unless I want to run that DAG manually. My ExternalTaskSensor stays in the running state, and after the timeout interval it fails with exception…
Mikhail
  • 729
  • 1
  • 8
  • 16
5
votes
2 answers

Does manual triggering of an Airflow DAG interfere with the scheduled Airflow trigger?

I want to use an Airflow DAG to run some jobs. I have set the schedule expression to every 25 minutes, like */25 * * * *. It seems to run at, for instance, 6:25, 6:50, and at 7 as well, but I want it to run at 7:15, not at 7. As an alternative, I want…
Anand Vidvat
  • 977
  • 7
  • 20
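
The run times described in the excerpt above match standard cron semantics, where a step in the minute field restarts at minute 0 of every hour. Below is a minimal sketch in plain Python (no Airflow import) that contrasts the two behaviours. The function names `step_minutes`, `cron_step_fire_times`, and `fixed_interval_fire_times` are hypothetical helpers written for this illustration, not part of Airflow or any cron library, and the start time is an arbitrary assumption.

```python
from datetime import datetime, timedelta
from typing import List


def step_minutes(step: int) -> List[int]:
    """Minutes of the hour matched by a cron minute field of the form */step."""
    return list(range(0, 60, step))


def cron_step_fire_times(start: datetime, step: int, count: int) -> List[datetime]:
    """First `count` times strictly after `start` matching '*/step * * * *'."""
    matched = set(step_minutes(step))
    current = start.replace(second=0, microsecond=0)
    fire_times: List[datetime] = []
    while len(fire_times) < count:
        current += timedelta(minutes=1)  # walk forward one minute at a time
        if current.minute in matched:
            fire_times.append(current)
    return fire_times


def fixed_interval_fire_times(start: datetime, minutes: int, count: int) -> List[datetime]:
    """First `count` times spaced exactly `minutes` apart, counted from `start`."""
    return [start + timedelta(minutes=minutes * i) for i in range(1, count + 1)]


# Hypothetical start time chosen only for illustration.
start = datetime(2024, 1, 1, 6, 0)
cron_times = [t.strftime("%H:%M") for t in cron_step_fire_times(start, 25, 4)]
fixed_times = [t.strftime("%H:%M") for t in fixed_interval_fire_times(start, 25, 4)]
```

Under these assumptions the cron-style list includes 07:00 while the fixed-interval list includes 07:15. In Airflow, a `datetime.timedelta` schedule is the usual way to ask for a fixed interval rather than a cron expression, though whether it suits this particular question depends on details the excerpt does not show.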
5
votes
1 answer

How to fix the error "AirflowException("Hostname of job runner does not match")"?

I'm running Airflow on my computer (Mac AirBook, 1.6 GHz Intel Core i5 and 8 GB 2133 MHz LPDDR3). A DAG with several tasks failed with the error below. I checked several articles online, but they were of little to no help. There is nothing wrong with the task…
ajay
  • 51
  • 1
  • 6