5

I am pretty new to Airflow and am trying to understand how we should set it up in our environment (on AWS).

I read that Airflow uses Celery with a Redis broker. How is that different from Mesos? I have not used Celery before, but I tried setting up Celery with Redis on my dev machine and it worked with ease. However, adding new components means adding more monitoring.

Since we already use Mesos for our cluster management, what would I be missing if I don't choose Celery and go with the MesosExecutor instead?

Roger

4 Answers

2

Using Celery is the more proven/stable approach at the moment.

For us, managing dependencies using containers is more convenient than managing them on the Mesos instances, which is what you have to do if you choose MesosExecutor. As such, we are finding Celery more flexible.

We are currently using Celery + RabbitMQ, but we plan to switch to MesosExecutor in the future as our codebase stabilises.

ImDarrenG
1

Airflow with the CeleryExecutor doesn't necessarily need to use the Redis broker. Any broker that Celery can use is compatible with Airflow, though it is recommended to use either the RabbitMQ broker or the Redis broker.
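To make the broker choice concrete, here is a minimal sketch of how the setting could be expressed. It assumes an Airflow version that reads `AIRFLOW__SECTION__KEY` environment overrides; the helper name `airflow_celery_env`, the hosts, and the credentials are hypothetical placeholders, and a result backend would still need to be configured separately.

```python
# Placeholder broker URLs in the form Celery accepts.
# Hosts, ports, and credentials are assumptions for illustration only.
BROKER_URLS = {
    "redis": "redis://localhost:6379/0",
    "rabbitmq": "amqp://guest:guest@localhost:5672//",
}


def airflow_celery_env(broker):
    """Return environment overrides that select CeleryExecutor with the given broker.

    Hypothetical helper: it only builds the variable names and values,
    it does not start or configure Airflow itself.
    """
    if broker not in BROKER_URLS:
        raise ValueError("unknown broker: %s" % broker)
    return {
        "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
        "AIRFLOW__CELERY__BROKER_URL": BROKER_URLS[broker],
    }


if __name__ == "__main__":
    # Print shell exports for the Redis variant.
    for key, value in sorted(airflow_celery_env("redis").items()):
        print("export %s=%s" % (key, value))
```

Switching from Redis to RabbitMQ in this sketch changes only the broker URL; the executor setting stays the same.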

Celery is quite different from Mesos. While Airflow supports the MesosExecutor too, it is recommended to use the CeleryExecutor if you are planning to distribute the workers. From what I know, Airbnb uses the CeleryExecutor and actively maintains it.

Vineet Goel
1

For us, the MesosExecutor cannot be used. We need an abstraction layer to handle job dependencies; we cannot (and shouldn't) rely on any dependencies being installed on the Mesos slaves. Once the MesosExecutor supports Docker containers and/or Mesos containers, we can turn to it. Also, I like seeing the allocated workers inside Marathon. I am working on how to autoscale workers with Marathon.

Gaetan
0

The MesosExecutor is still experimental at this stage. It does not support running Docker containers or setting different resource limits per task, and it probably has many other limitations.

I plan to work on this, though. It's a community effort, and having spent some effort deploying a Mesos cluster, I feel that adding Celery and another MQ broker is a waste of resources.

ludovicc