Questions tagged [airflow-scheduler]

Apache Airflow is a platform to programmatically author, schedule and monitor workflows. The Airflow scheduler monitors all tasks and all DAGs, and triggers the task instances whose dependencies have been met.

1257 questions
6 votes, 1 answer

Airflow: Duplicate entry mysql integrity error when triggering a DAG run

I have two Airflow DAGs: scheduler and worker. The scheduler DAG runs every minute, polls for new aggregation jobs, and triggers worker jobs. You can find the code for the scheduler job below. However, out of over 6000 scheduler job runs, 30 failed with the…
6 votes, 1 answer

How to automatically reschedule airflow tasks

I am running an hourly process that picks up data from one location ("origin") and moves it to another ("destination"). For the most part, the data arrives at my origin at a specific time and everything works fine, but there can be delays and when…
Nir Ben Yaacov
6 votes, 3 answers

How to arrange Airflow Dags in the UI tidy in folders?

I'm using Airflow version 1.9.0, and I'm going to have hundreds of DAGs. Is there a way to organize the Airflow UI into folders and subfolders, and then put the DAGs in them?
Maor Aharon
6 votes, 3 answers

"Error: /run/airflow doesn't exist. Can't create pidfile." when using systemd for Airflow webserver

I have configured my Airflow setup to run with systemd according to this. It worked well for a couple of days, but it has since thrown some errors that I can't figure out how to fix. Running sudo systemctl start airflow-webserver.service doesn't really do…
6 votes, 1 answer

Airflow task retried after failure despite retries=0

I have an Airflow environment running on Cloud Composer (3 n1-standard-1 nodes; image version: composer-1.4.0-airflow-1.10.0; config override: core catchup_by_default=False; PyPI packages: kubernetes==8.0.1). During a DAG run, a few tasks (all…
D Cohen
6 votes, 1 answer

How can I find out if a DAG is paused/unpaused in Airflow?

I would like to pause DAGs that are idle and redundant. How do I know which DAGs are unpaused and which are paused? I have a list of DAGs that are to be unpaused using a bash command that executes airflow pause . I would like to know if…
AlphaCR
6 votes, 2 answers

Airflow task with null status

I'm having an issue with Airflow when running it on a 24xlarge machine on EC2. I must note that the parallelism level is 256. On some days the DAG run finishes with status 'failed' for two undetermined reasons: Some task has the status…
I.Chorfi
6 votes, 3 answers

Airflow DAG in functions?

I am working in $AIRFLOW_HOME/dags. I have created the following files:

    - common
      |- __init__.py  # empty
      |- common.py    # common code
    - foo_v1.py       # dag instantiation

In common.py:

    default_args = ...
    def create_dag(project, version): …
pgrandjean
6 votes, 5 answers

Airflow backfill stops if any task fails

I am using the Airflow CLI's backfill command to manually run some backfill jobs:

    airflow backfill mydag -i -s 2018-01-11T16-00-00 -e 2018-01-31T23-00-00 --reset_dagruns --rerun_failed_tasks

The DAG interval is hourly and it has around 40 tasks.…
6 votes, 2 answers

CeleryExecutor in Airflow are not parallelizing tasks in a subdag

We're using Airflow 1.10.0. After analyzing why some of our ETL processes take so long, we saw that the subdags use a SequentialExecutor instead of the BaseExecutor, even when we configure the CeleryExecutor. I would like to know if…
Flavio
6 votes, 1 answer

Airflow trigger tasks only based on previous runs status

Is there a way to trigger the next task based on previous task run states? Scenario as below:
Task1 - First task in my DAG
Task2 - Run task2 only when task1 has succeeded
Task3 - Run task 3 only when task3 has succeeded
Task4 - Run task 4 only when…
mnk
6 votes, 2 answers

Airflow scheduler keep on Failing jobs without heartbeat

I'm new to Airflow and I tried to manually trigger a job through the UI. When I did that, the scheduler kept logging that it is failing jobs without heartbeat, as follows: [2018-05-28 12:13:48,248] {jobs.py:1662} INFO - Heartbeating the…
6 votes, 2 answers

Airflow: what's the standard way of delaying a day's DAG run?

I've got a DAG that's scheduled to run daily. In most scenarios, the scheduler would trigger this job as soon as the execution_date is complete, i.e., the next day. However, due to upstream delays, I only want to kick off the dag run for the…
conradlee
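
The timing described in the excerpt above (a daily run is triggered once its execution_date interval has ended) can be sketched in plain Python. This is only an illustration of the arithmetic, not Airflow code: the function name earliest_trigger_time and the extra_delay parameter are hypothetical and are not Airflow settings.

```python
from datetime import datetime, timedelta


def earliest_trigger_time(execution_date, schedule_interval, extra_delay=timedelta(0)):
    """Return the earliest moment a run for `execution_date` would start.

    A run covers the interval [execution_date, execution_date + schedule_interval)
    and is triggered only after that interval has ended. `extra_delay` is a
    hypothetical knob modelling an additional wait for late upstream data.
    """
    return execution_date + schedule_interval + extra_delay


if __name__ == "__main__":
    run = datetime(2018, 1, 1)
    # Daily schedule with no extra delay.
    print(earliest_trigger_time(run, timedelta(days=1)))
    # Same run, but waiting six more hours for upstream data.
    print(earliest_trigger_time(run, timedelta(days=1), timedelta(hours=6)))
```

How such a delay is actually expressed in a DAG (for example with a sensor or a shifted schedule) is outside what this sketch shows.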
6 votes, 3 answers

Apache Airflow not scheduling tasks

I have installed apache-airflow (version v1.9.0) along with Python 2.7. To test whether it's installed properly, I tried to trigger a tutorial DAG from the interactive view in the browser. The interface shows that the DAG is running, but the scheduler…
6 votes, 1 answer

Airflow configuration in environment variable not working

I tried using environment variables to configure connection URLs. I have an AMI that is preconfigured with alchemy_conn, broker_url, etc., and I have written environment variables to /etc/environment in the instances being spun up from the AMIs to override the…
Somasundaram Sekar
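
For the environment-variable question above, it may help to recall that Airflow reads configuration overrides from variables named AIRFLOW__{SECTION}__{KEY}, with double underscores and upper case. The sketch below only builds those names. The helper airflow_env_var is hypothetical, and it assumes that "alchemy_conn" in the excerpt refers to sql_alchemy_conn in the [core] section and broker_url in the [celery] section, as in Airflow 1.x.

```python
def airflow_env_var(section, key):
    """Build the environment variable name that overrides [section] key.

    Hypothetical helper: Airflow itself only defines the naming convention
    AIRFLOW__{SECTION}__{KEY}; this function is not part of Airflow.
    """
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


if __name__ == "__main__":
    # Section/key pairs are assumptions based on the Airflow 1.x layout.
    for section, key in [("core", "sql_alchemy_conn"), ("celery", "broker_url")]:
        print(airflow_env_var(section, key))
```

Whether a process actually sees values written to /etc/environment depends on how it is started, which this sketch does not address.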