Questions tagged [airflow-scheduler]

Apache Airflow is a platform to programmatically author, schedule and monitor workflows. The Airflow scheduler monitors all tasks and all DAGs, and triggers the task instances whose dependencies have been met.

1257 questions
5
votes
3 answers

Airflow 1.10.1 - Change TimeZone

I am running airflow (1.10.1) inside a VM on GCP via docker. I have already changed the local time of my VM and, in the config (airflow.cfg), set the default_zone to that of my country (America/Sao_Paulo), but it still shows UTC time on the home screen and…
Felipe FB
  • 1,212
  • 6
  • 22
  • 55
5
votes
1 answer

Airflow - Incorrect Last Run

I just ran an airflow DAG. When I look at the airflow last run date, it displays the run before the last one. What catches my attention is that when I hover over the "i" icon, it shows the correct date. Is there any way to solve this? Sounds like nonsense but I end…
Felipe FB
  • 1,212
  • 6
  • 22
  • 55
5
votes
0 answers

Airflow run fails but airflow test works. Not sure why

So before asking this I've looked through the docs and had a look at Difference between "airflow run" and "airflow test" in Airflow to see if I can figure out why I am having this problem. I've got a few dags, all of which use LocalExecutor. Two of…
shwifty chill
  • 338
  • 1
  • 3
  • 12
5
votes
1 answer

Airflow GCP Connection Issue With Kubernetes - Fernet key must be 32 url-safe base64-encoded bytes

I am currently running Airflow on Kubernetes in Google Cloud (GCP). I based my project off of docker-airflow. I am able to start the UI, but when I try to create a connection for Google Cloud and submit it, I get the following errors. …
Ryan Riopelle
  • 51
  • 1
  • 5
5
votes
2 answers

Use airflow hive operator and output to a text file

Hi, I want to execute a Hive query using the Airflow Hive operator and output the result to a file. I don't want to use INSERT OVERWRITE here. hive_ex = HiveOperator( task_id='hive-ex', hql='/sql/hive-ex.sql', hiveconfs={ 'DAY': '{{ ds…
user8617180
  • 267
  • 6
  • 20
5
votes
3 answers

How to stop DAG from backfilling? catchup_by_default=False and catchup=False does not seem to work and Airflow Scheduler from backfilling

The setting catchup_by_default=False in airflow.cfg does not seem to work. Adding catchup=False to the DAG doesn't work either. Here's how to reproduce the issue. I always start from a clean slate by running airflow resetdb. As soon as I…
Sam
  • 1,288
  • 1
  • 13
  • 22
5
votes
1 answer

Airflow Error - ValueError: Unable to configure handler 'file.processor'

I am running airflow 1.9.0 in a codebuild container with Python 3.6.5. We execute the following commands and get the error ValueError: Unable to configure handler 'file.processor': 'FileProcessorHandler' object has no attribute 'log' sudo sh…
5
votes
1 answer

Airflow Scheduler keeps crashing, DB connection error (Google Composer)

I've been using Google Composer for a while (composer-0.5.2-airflow-1.9.0), and had some problems with the Airflow scheduler. The scheduler container crashes sometimes, and it can get into a locked situation in which it cannot start any new tasks…
Dalar
  • 135
  • 1
  • 7
5
votes
2 answers

How to get airflow to add thousands of tasks to celery at one time?

I'm evaluating Airflow 1.9.0 for our distributed orchestration needs (using CeleryExecutor and RabbitMQ), and I am seeing something strange. I made a dag that has three stages: (1) start, (2) fan out and run N tasks concurrently, (3) finish. N can be large,…
Kevin Pauli
  • 8,577
  • 15
  • 49
  • 70
5
votes
3 answers

Airflow BashOperator OSError: [Errno 2] No such file or directory

I keep getting the same error from a scheduled BashOperator that is currently back-filling (it's over a month "behind"). [2018-06-10 22:06:33,558] {base_task_runner.py:115} INFO - Running: ['bash', '-c', u'airflow run dag_name task_name…
artdv
  • 774
  • 1
  • 8
  • 23
5
votes
3 answers

Scheduling dag runs in Airflow

I have a general query on Airflow. Is it possible to have a dag file scheduled based on another dag file's schedule? For example, if I have 2 dags, namely dag1 and dag2, I am trying to see if I can have dag2 run each time dag1 is successful else dag2…
Kevin Nash
  • 1,511
  • 3
  • 18
  • 37
5
votes
2 answers

Why are all of my Airflow dags one run behind?

I'm setting up Airflow right now and loving it, except for the fact that my dags are perpetually running behind. See the picture below - this was taken on 2/19 at 15:50 UTC, and you can see that for each of the dags, they should have run exactly one…
Aviv Goldgeier
  • 799
  • 7
  • 23
5
votes
1 answer

How to increase tasks queued per second?

I am trying to diagnose an under-performing airflow pipeline and am wondering what kind of performance I should expect out of the airflow scheduler in terms similar to "tasks scheduled per second". I have few queued jobs and many of my tasks finish…
7yl4r
  • 4,788
  • 4
  • 34
  • 46
5
votes
5 answers

web server of airflow is not running

I'm configuring an email scheduler in Airflow in Django but it's not working. Error in terminal: airflow webserver [2017-12-29 10:52:17,614] {__init__.py:57} INFO - Using executor SequentialExecutor [2017-12-29 10:52:17,734] {driver.py:120} INFO -…
Hitesh Roy
  • 337
  • 2
  • 3
  • 11
5
votes
2 answers

Can airflow be used to run a never ending task?

Can we use an airflow dag to define a never-ending job (i.e. a task which has an unconditional loop to consume stream data) by setting the task/dag timeout to None and manually triggering it? Would having airflow monitor a never ending task…
FZF
  • 855
  • 4
  • 12
  • 29