Questions tagged [airflow-scheduler]

Apache Airflow is a platform to programmatically author, schedule, and monitor workflows. The Airflow scheduler monitors all tasks and DAGs and triggers the task instances whose dependencies have been met.

1257 questions
14
votes
2 answers

Minimum hardware requirements for Apache Airflow cluster

What are the minimum hardware requirements for setting up an Apache Airflow cluster, e.g. RAM, CPU, and disk for the different types of nodes in the cluster?
Duleendra
  • 497
  • 1
  • 5
  • 21
13
votes
1 answer

Airflow scheduler fails to start with kubernetes executor

I am using the https://github.com/helm/charts/tree/master/stable/airflow Helm chart and building a v1.10.8 puckle/docker-airflow image with kubernetes installed on it, then using that image in the Helm chart, but I keep getting File…
Asav Patel
  • 1,113
  • 1
  • 7
  • 25
13
votes
1 answer

Submit and monitor SLURM jobs using Apache Airflow

I am using the Slurm job scheduler to run my jobs on a cluster. What is the most efficient way to submit the Slurm jobs and check on their status using Apache Airflow? I was able to use an SSHOperator to submit my jobs remotely and check on their…
stardust
  • 177
  • 2
  • 9
13
votes
2 answers

How to uninstall Airflow?

I am a newbie to Airflow. I am having trouble removing Airflow v1.10.3; I am using pip3 version 8.1.1 on Ubuntu 16.04. I already tried to remove pip with sudo apt-get remove python3-pip and sudo apt-get remove pip3 and all its dependencies, and…
Yassin Abid
  • 315
  • 2
  • 5
  • 12
13
votes
3 answers

Airflow - Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url

I am running Airflow v1.9 with Celery Executor. I have 5 Airflow workers running on 5 different machines. The Airflow scheduler is also running on one of these machines. I have copied the same airflow.cfg file across these 5 machines. I have daily…
riyaB
  • 307
  • 1
  • 3
  • 21
13
votes
4 answers

Airflow depends_on_past for whole DAG

Is there a way in airflow of using depends_on_past for an entire DagRun, not just applied to a Task? I have a daily DAG, and the Friday DagRun errored on the 4th task; however, the Saturday and Sunday DagRuns still ran as scheduled. Using…
chop4433
  • 211
  • 1
  • 2
  • 4
13
votes
1 answer

Error while connecting postgres db from airflow

Using: sql_alchemy_conn = db+postgresql://username:xxx@127.0.0.1:5432/airflow gives error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:db.postgresql and when using: sql_alchemy_conn =…
Javed
  • 5,904
  • 4
  • 46
  • 71
13
votes
3 answers

Airflow: Log file isn't local, Unsupported remote log location

I am not able to see the logs attached to the tasks from the Airflow UI: Log-related settings in airflow.cfg file are: remote_base_log_folder = base_log_folder = /home/my_projects/ksaprice_project/airflow/logs worker_log_server_port = 8793…
Javed
  • 5,904
  • 4
  • 46
  • 71
13
votes
2 answers

How to run one airflow task and all its dependencies?

I suspected that airflow run dag_id task_id execution_date would run all upstream tasks, but it does not. It will simply fail when it sees that not all dependent tasks are run. How can I run a specific task and all its dependencies? I am guessing…
itzjustricky
  • 423
  • 1
  • 4
  • 14
12
votes
4 answers

Airflow 1.10.3 SubDag can only run 1 task in parallel even though the concurrency is 8

Recently, I upgraded Airflow from 1.9 to 1.10.3 (latest one). However, I noticed a performance issue related to SubDag concurrency. Only 1 task inside the SubDag can be picked up, which is not the way it should be, our concurrency setting for the…
Kevin Li
  • 2,068
  • 15
  • 27
12
votes
1 answer

Tasks retrying more than specified retry in Airflow

I have recently upgraded my airflow to 1.10.2. Some tasks in the dag are running fine while some tasks are retrying more than the specified number of retries. One of the task logs shows - Starting attempt 26 of 2. Why is the scheduler scheduling it…
Vipul Pandey
  • 571
  • 1
  • 7
  • 14
12
votes
0 answers

Airflow: Logs are not showing in the UI

When I click on a task and then click on the 'Log' button, it doesn't display anything. However, I have edited the config file to store them somewhere specific. base_log_folder = /var/log/airflow and in the UI it specifically says that the log for that…
shwifty chill
  • 338
  • 1
  • 3
  • 12
12
votes
5 answers

How to skip tasks on Airflow?

I'm trying to understand whether Airflow supports skipping tasks in a DAG for ad-hoc executions? Let's say my DAG graph looks like this: task1 > task2 > task3 > task4 And I would like to start my DAG manually from task3, what is the best way of doing…
Maayan
  • 273
  • 1
  • 2
  • 11
12
votes
3 answers

What start_date should I use for a manually triggered DAG?

Many of the airflow example dags that have schedule_interval=None set a dynamic start date like airflow.utils.dates.days_ago(2) or datetime.utcnow(). However, the docs recommend against a dynamic start date: We recommend against using dynamic…
rcorre
  • 6,477
  • 3
  • 28
  • 33
12
votes
1 answer

Why is a task stuck and not executed in airflow?

I'm trying out airflow with the BigQueryOperator. I thought I would use google composer later on, but I want it running locally first. I have airflow up and running, and a BashOperator works fine, I can also run airflow test where task is…
Tomas Jansson
  • 22,767
  • 13
  • 83
  • 137