I am a newbie to Airflow, and I am working out how to throttle current jobs in Airflow. Does anyone know a little about concurrency or throttling in Airflow? Any suggestions would be helpful. Thanks a lot.
1 Answer
If you want to throttle tasks in a DAG, set its "concurrency" parameter.
"concurrency" defines how many running task instances a DAG is allowed to have, beyond which point things get queued.
If you want to throttle tasks globally, look at these lines of the config file (airflow.cfg):
# The amount of parallelism as a setting to the executor. This defines the max number of task instances that should run simultaneously on this airflow installation
parallelism = 32
And
# The number of task instances allowed to run concurrently by the scheduler
dag_concurrency = 16
The first is a global cap for the whole installation; the second is the default "concurrency" value applied to every DAG that does not set its own.
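To make the relationship between the two settings concrete, here is a rough standalone sketch in plain Python. It is not Airflow's scheduler code: it only reads the two values from a sample [core] section and models the idea that a DAG falls back to dag_concurrency when it has no "concurrency" of its own, and that both caps apply at once. The helper names (load_limits, dag_task_limit, runnable_now) are hypothetical, and other limits such as pools are ignored.

```python
import configparser

# Sample airflow.cfg fragment using the values quoted above.
SAMPLE_CFG = """
[core]
parallelism = 32
dag_concurrency = 16
"""


def load_limits(cfg_text):
    """Return (parallelism, dag_concurrency) from the [core] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    core = parser["core"]
    return core.getint("parallelism"), core.getint("dag_concurrency")


def dag_task_limit(dag_concurrency_default, dag_concurrency=None):
    """Per-DAG cap: the DAG's own "concurrency" if set, else the config default."""
    if dag_concurrency is None:
        return dag_concurrency_default
    return dag_concurrency


def runnable_now(parallelism, running_total, dag_limit, running_in_dag):
    """Hypothetical helper: how many more tasks of one DAG could start
    before either the global cap or the per-DAG cap is reached."""
    global_room = max(parallelism - running_total, 0)
    dag_room = max(dag_limit - running_in_dag, 0)
    return min(global_room, dag_room)


parallelism, dag_default = load_limits(SAMPLE_CFG)
print(dag_task_limit(dag_default))     # DAG without its own setting
print(dag_task_limit(dag_default, 4))  # DAG created with concurrency=4
print(runnable_now(parallelism, 30, dag_task_limit(dag_default), 5))
```

In this simplified model, a DAG with 5 running tasks and a cap of 16 can still only start 2 more when 30 of the 32 global slots are taken, because the tighter of the two limits wins.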